DOCUMENT RESUME



ED 185 126

PUB DATE      Dec 79
CONTRACT      400-79-0003
NOTE          18Up.; For related documents, see TM 800 350-352.

EDRS PRICE    MF01/PC08 Plus Postage.
DESCRIPTORS   Academic Standards; Competence; Diffusion; Elementary
              Secondary Education; Instructional Design; Minimum
              Competencies; *Minimum Competency Testing; Program
              Administration; Program Effectiveness; Publicity;
              Test Construction; Testing Problems; *Testing
              Programs; Test Reliability; Test Selection; Test
              Validity; Validity



ABSTRACT

This resource document represents the integration of
both practice and theory related to minimum competency testing (MCT),
and is largely based on information collected in a nationwide survey
of MCT programs. Chapter 1, To Implement or Not to Implement MCT, by
Marcy R. Perkins, presents a definition of MCT and a discussion of
the perceived benefits and costs of a MCT program. Chapter 2,
Defining Competencies, by Perkins, presents the basic elements in the
process of defining competencies, and describes how current programs
are dealing with those issues. Chapter 3, Test Selection and
Development, by Michael Priestley, discusses the initial decision to
select or develop tests, and procedures in test selection and
development, including the establishment of test reliability and
validity. Chapter 4, Setting Standards, by Paula M. Nassif, describes
standard setting strategies, including judgments on items and
judgments on examinees. Chapter 5, Integrating Testing with
Instruction, by Mary F. Tobin, discusses approaches to using test
results for remedial, diagnostic purposes or curriculum development.
Chapter 6, Program Management, by William Phillip Gorth and Peter E.
Schriber, presents strategies for planning and personnel needs and
costs of a MCT program. Chapter 7, Dissemination, by Schriber and
Gorth, focuses on dissemination within and about a MCT program.



***********************************************************************
*   Reproductions supplied by EDRS are the best that can be made     *
*                    from the original document.                      *
***********************************************************************



NATIONAL INSTITUTE OF EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL
INSTITUTE OF EDUCATION POSITION OR POLICY.



The materials contained in this report were prepared for the National
Institute of Education (NIE), Department of Health, Education, and Welfare,
under contract number 400-79-0003. This contract was awarded December 15, 1978,
as the result of a competitive bidding procedure, to National Evaluation Systems
(NES), a firm that has developed and administered minimum competency tests
under contract to State and local education agencies.

The purpose of this contract was to obtain previously unavailable
descriptive information about minimum competency testing programs for
enlightenment of educators, researchers, and others interested in this area.
Information on the consequences or impacts of these programs was not within
the scope of work for this contract. However, NIE is currently planning a
complementary study that will focus on program impacts.

In obtaining the descriptive information presented here, the NES project
staff, during the spring of 1979, interviewed the directors of all State
minimum competency testing programs and of 21 local district programs.
Subsequent to these visits, NES staff developed written program descriptions,
and these were sent to the program directors for verification. It is these
verified program descriptions that form the basis for this report.

It should be emphasized that the information presented here provides a
snapshot of the status of minimum competency testing programs as of June 30,
1979, and, owing to the dynamic nature of these programs, may not portray the
programs as they are operating today.

Further, it should be emphasized that any opinions expressed in this report
do not necessarily reflect NIE or HEW position or policy, and no endorsement of
minimum competency testing or of any model described in this report by NIE or
HEW should be inferred.



A Study of Minimum Competency Testing Programs

FINAL PROGRAM DEVELOPMENT
RESOURCE DOCUMENT

SUBMITTED BY:

William Phillip Gorth, Project Director
Marcy R. Perkins, Project Coordinator
National Evaluation Systems, Inc.
30 Gatehouse Road
Amherst, Massachusetts 01002

A PROJECT SPONSORED BY:

Office of Testing, Assessment and Evaluation
National Institute of Education
Dr. Judith S. Shoemaker, Project Officer

December 1979



TABLE OF CONTENTS

PREFACE

OVERVIEW

CHAPTER 1: TO IMPLEMENT OR NOT TO IMPLEMENT MCT
    Marcy R. Perkins

    Introduction
    What Is Minimum Competency Testing?
    "To Implement, or not to Implement, MCT..."
    Summary

CHAPTER 2: DEFINING COMPETENCIES
    Marcy R. Perkins

    Introduction
    Basic Elements in the Process of Defining Competencies
    Summary Guidelines for Defining Competencies:
      Three Examples

CHAPTER 3: TEST SELECTION AND DEVELOPMENT
    Michael Priestley

    Introduction
    Initial Decision: To Select or Develop
    Test Selection
    Test Development
    Establishing Validity and Reliability

CHAPTER 4: SETTING STANDARDS
    Paula M. Nassif

    Introduction
    Issues and Parameters
    Standard Setting Strategies
      Judgments on Items
      Judgments on Examinees
    What Is Actually Being Done

CHAPTER 5: INTEGRATING TESTING WITH INSTRUCTION
    Mary F. Tobin

    Introduction
    MCT Results and Decisions Related to Curriculum and
      Instruction
    Options for Organizing Instruction and Remediation
    Choosing the Appropriate Arrangements
    Integrating the Testing Program with Curriculum and
      Instruction

CHAPTER 6: PROGRAM MANAGEMENT
    William Phillip Gorth and Peter E. Schriber

    Introduction
    Personnel
    Costs

CHAPTER 7: DISSEMINATION
    Peter E. Schriber and William Phillip Gorth

    Introduction
    The Planning Process
    Documenting the Plan






PREFACE 



This j-esource document represents the •^njefl^**!^" JjJ^^f,!! P^t^^JJed 

ne» to MCT program developers »n(l reviewers. 

While Sherry A. «"'''«*5:"rS»Jre'„t"\!leTlr; To^b^''^^^^^^^ • 

vidual chapters, as well as to their ^ chaptjrs-and 

R. Harris accomplished the 1nve.luable t^^^ s contettt. Dr. Allan 

in some cases contributed to a "J^oj^J^J^J"^"^^^ Mary Tobirt Is 

contributed his expertise c^«PjJ^h^Sro!!%: the doc2«ient is a whole; 






OVERVIEW 
Marcy R. Perkins 



Introduction 



Because public concern about the condition of the American educational
system has grown in recent years, more and more programs are being designed
to assess whether students have acquired some specified set of skills to a
predetermined minimum level. This trend toward minimum competency testing
(MCT) has grown so fast, however, that educational decision makers are
faced with the problem of designing and implementing such programs with
little information as to what issues to consider, what questions to ask,
and what decisions to make.

The major purpose of this resource document, therefore, is to provide
information to help educational decision makers on all levels make informed
choices about minimum competency testing. The document is designed to
present a range of options that have been tried in the field, and to
present issues that have arisen in the course of implementing MCT programs.
It can serve as a resource for discussions about minimum competency testing
or for its implementation.

The document is likely to be most useful to those for whom a decision
has been made, on whatever level, to develop and implement a minimum
competency testing program or to review the adequacy of a proposed or existing
program. The document is intended for a wide range of audiences, from
state legislators to state education department staff to local district
administrators, teachers, and consultants. The goal is to reach anyone who
has an interest in, or is responsible for, any part of a minimum competency
testing program.



Implementation Issues to Consider

Regardless of the purpose or level of involvement an educator has with
respect to a minimum competency testing program, a thorough consideration
of such questions as the following may help in making whatever decisions
there are to be made:

-- What kinds of competencies shall we define (e.g., life skills,
   basic skills)?

-- Who will have responsibility for defining the competencies?

-- How do we set standards?

-- What standards shall we set?

-- Do we develop or select tests? How do we do either?

-- If we develop a test, how do we ensure its fairness?

-- Shall we have different tests/standards/competencies for racial
   groups/ethnic groups/special education students/limited-English-
   speaking students?

-- Who is to administer the tests?

-- What kinds of scores do we want to compute?

-- Who do we report results to?

-- Do we disseminate just test results, or the tests themselves?
   How does this decision affect test development?

-- How do we use what money we have most effectively?

-- What is a good way to manage this program?

-- Do we want to build in formative/summative evaluation of the
   program? Shall we systematically study the impacts of our
   program?

-- How will we know if and when our goals have been met?

-- After MCT, what?



While it might be worthwhile to treat all of the issues in detail,
and so satisfy all the needs of any program developer, the resulting
encyclopedic document might no longer be timely, and might also be so weighty
and unwieldy as to function only as a 100-pound bookend on program
developers' shelves. The topics discussed in this document, therefore, which
are only a subset of those which could be discussed, were selected on the
basis of a needs analysis conducted during site visits to more than 50 MCT
programs and on the basis of the needs which program developers expressed
at national conferences.



General Chapter Characteristics

This document is intended to be nonevaluative, and therefore no single
perspective will be advocated on any issue. Rather, the salient issues
related to MCT that have been identified through the site visits to
operating programs are described. Since the document is also intended to be
practical, instead of strictly academic, it will present examples of
procedures and materials used by local and state agencies to help illustrate
what can be done to resolve the issues under discussion. It is important
to note, however, that in cases where specific practices are mentioned or
materials cited, these references are not in any way endorsements of the
particular procedures or documents. Finally, the authors herein do not
assume that all readers are always familiar with the terminology of
educators and measurement specialists. Therefore, to avoid confusion or
ambiguity, technical terms or terms with very specific usages are also defined
in the context of the particular chapter in which they occur.



Document Framework 



Discussions about minimum competency testing programs generally
revolve around the various components of these programs and the activities
associated with developing these components. In this document, while more
components and activities are discussed than may be reflected in chapter
titles, not all possible components or activities are included because of
space limitations. In order to help the reader access information of
interest, a number of components generally associated with minimum
competency testing programs are listed below. Next to each is the chapter
number in which some discussion of that component can be found.



Components                              Chapter

Policy
Program Purposes                           1
Competencies                               2
Measurement Instruments                    3
Standards                                  4
Target Groups                              2
Testing Schedule                           6
Test Administration                        6
Scoring and Analysis                       6
Reporting and Dissemination                7
Use of Data                                5
Testing Special Populations
Remediation                                5
Program Staffing and Management            6
Strategies for Cost Effectiveness          6
Program Evaluation



Summary of Remaining Chapters 



Chapter 1: To Implement or not to Implement MCT 



The major intents of this first chapter are to provide a definition
of MCT that will serve as the basis for the remaining chapters, and to
present the myriad of issues that have arisen in the field about whether
or not MCT should be implemented on any level. The perceived costs and
benefits of MCT that have been expressed by program personnel, testing
specialists, and the public are discussed.



Chapter 2: Defining Competencies

The purpose of this chapter is to present issues related to the
definition of competencies and to describe how programs in the field are
currently dealing with the issues. Discussed are a number of questions
programs are considering that concern the orientation of competencies, who
may be involved in the identification process, and how validation may take
place.



Chapter 3: Test Selection and Development

The primary purpose of this chapter is to present issues being faced
by programs that are related to making a decision to either select or
develop test instruments. Also discussed are the issues related to
implementing either option.



Chapter 4: Setting Standards

The aim of this chapter is to describe standard setting strategies
used in the field and to present issues concerning the selection of one or
another strategy.



Chapter 5: Integrating Testing with Instruction

Since a frequently expressed goal of minimum competency testing is to
identify students who need remediation, Chapter 5 discusses approaches to
using test results for remedial, diagnostic purposes. It also deals with
the integration of test results with instruction and the development of
instruction.



Chapter 6: Program Management

The purpose of this chapter is to present issues related to
the management of a minimum competency testing program, either at the
state or local level. A discussion of cost effectiveness strategies is
also included.




Chapter 7: Dissemination 



The last chapter focuses on issues related to dissemination within
and about a minimum competency testing program, and also considers the
question of how those directly affected by a program can be kept informed
of its activities and how the program can be presented to the public.



CHAPTER 1

TO IMPLEMENT OR NOT TO IMPLEMENT MCT
Marcy R. Perkins



Introduction 



As mentioned in the Overview, minimum competency testing is a
fast-growing educational phenomenon that continues to spread even in the
face of little information as to how programs may be developed and
implemented or what effects they may be having. While this entire document is
intended to help bridge that informational gap by presenting some of the
issues being faced in the field and discussing the ways in which programs
are resolving them, this chapter serves two specific purposes as a
preliminary to the other chapters.

First, since minimum competency testing "means many things to many
people" (Airasian, Pedulla, & Madaus, 1978), one intent of this chapter is
to provide a working definition of MCT. This definition, only one of the
many formulations possible, is based on the features observed and accepted
in the field which served as the basis for selecting the programs in the
study. Second, since this document is intended for all policymakers, not
just for those who have already implemented competency testing, issues
related to the question of whether or not MCT should be implemented will
be discussed in this chapter. Before turning to these, however, a number
of general points need to be discussed.

It is assumed here that systematic attempts to consider the issues,
both for and against the implementation of minimum competency testing,
will result in sounder decisions. This does not mean, however, that
decision makers in states and local districts which have already adopted such
programs cannot benefit from the material presented in this chapter. The
issues discussed may serve to shed light on both unresolved issues and
implementation difficulties that result from the failure of a program to
deal with the reservations of key individuals or groups.

Because of the necessary limitations of space, this chapter does
not discuss every one of the issues related to the perceived costs and
benefits of minimum competency testing. Moreover, no single perspective
will be advocated with respect to any of the issues raised, nor will a
stance be taken on the issue of whether to implement or not to implement
MCT. Finally, those interested in the history of minimum competency
testing are destined to be disappointed if they search for it here. An
account of the background and development of MCT is not likely to be as
helpful for program developers as a systematic presentation and discussion
of the strengths and weaknesses of MCT as seen by those in the field.



What Is Minimum Competency Testing?

    "When I use a word," Humpty Dumpty said, in rather a
    scornful tone, "it means just what I choose it to mean -- neither
    more nor less."

    "The question is," said Alice, "whether you can make words
    mean so many different things."

    "The question is," said Humpty Dumpty, "which is to be
    master -- that's all" (Lewis Carroll, Through the Looking Glass).



If there is one point upon which all testing specialists, program
administrators, and educational policymakers agree, it is that there is
no consistent terminology for minimum competency testing in use in the
testing field. "Standards" in some programs can mean "competencies" in
others; "competencies" themselves can be synonymous with "competency
areas," "objectives," "skill statements," and "performance indicators,"
to cite only a few terms among many. With this wealth of terminology,
some of which is specific to only a few programs, how then is minimum
competency testing defined? Are there components which are common to
all programs?

Table 1 presents the texts of nine definitions of MCT found in the
research and policy literature. In the first five, there is a clear
emphasis on student acquisition of certain minimum skills, and on
assessment of that achievement. In the sixth and seventh definitions, potential
effects of minimum competency testing, rather than its strict defining
characteristics, are delineated. In the last two, the specific components
and procedures of minimum competency testing programs are presented. Even
in these, however, the concept of some kind of a standard is evident.



TABLE 1

Definitions of Minimum Competency Testing
Employed in the Field

Minimum competency testing programs are "organized efforts to make sure
public school students are able to demonstrate their mastery of certain minimum
skills needed to perform tasks they will routinely confront in adult life."

    (AFSC, 1978)

Minimum competency tests are constructed to measure the acquisition of
competence or skills to or beyond a certain defined standard.

    (Miller, 1978)

Minimum competency testing programs are "testing programs which attempt to
learn whether each student is at least 'minimally competent' by the time the
student graduates from public school."

    (NSBA, 1978)

Minimum competency testing is "a certification mechanism whereby a pupil must
demonstrate that he/she has mastered certain minimal (sic) skills in order to
receive a high school diploma."

    (Airasian et al., 1978)

Minimum competency testing is "a device to increase emphasis on the three R's
or basics."

    (Airasian et al., 1978)

Minimum competency testing is "a mechanism for tightening up promotion
requirements; certifying early exit from the school system; holding educators
responsible for poor student achievement; increasing the cost-effectiveness of
education; identifying and remediating pupils who have learning difficulties; or
increasing the public's confidence in the schools and their graduates."

    (Airasian et al., 1978)

Nearly all minimum competency testing programs seek "to define minimum
learning outcomes for students in a variety of academic areas" and "to insure
that these standards are satisfied."

    (Cohen & Haney, 1978)

Minimum competency testing involves:

(1) the use of objective, criterion-referenced competency tests;

(2) the assessment of reading and computation using "real life" or "life skill"
    items;

(3) the requirement of a specified mastery level for high school graduation;

(4) the early introduction of such testing for purposes of identification and
    remediation.

    (Elford, 1977)

Competency-based education (used in this paper nearly synonymously with
minimum competency testing) is "a data-based, adaptive, performance-oriented
set of integrated processes that facilitate, measure, record, and certify within
the context of flexible time parameters the demonstration of known, explicitly
stated, and agreed upon learning outcomes that reflect functioning in life roles."

    (Spady, 1977)



For the purposes of the NIE study of minimum competency testing
programs, two features were selected as being distinctive of MCT programs.
Programs can, and do, vary widely on a great number of dimensions, but to
be included in the study, any program under consideration had to have at
least the following two features:

(1) the presence of an explicit standard for determining acceptable
    performance; and

(2) the use of test results to make decisions about individual
    students.



No other features were taken into account, such as the reasons for
initiating a program (e.g., certification of students for graduation,
grade promotion decisions, identification of students in need of
remediation), or the grade levels set for testing (e.g., high school
grades only; a mix of elementary, junior high and high school grades;
elementary grades only).
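The two-feature inclusion rule can be restated as a short sketch. This is only an illustration of the criterion as described above, not a procedure used in the study; the `Program` record, its field names, and the sample programs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Program:
    """Hypothetical record of a testing program (field names are illustrative)."""
    name: str
    has_explicit_standard: bool        # feature (1): explicit performance standard
    decides_about_individuals: bool    # feature (2): results drive individual decisions
    grade_levels: tuple = ()           # recorded, but ignored by the criterion


def meets_study_definition(program: Program) -> bool:
    # Both features must be present; purpose and grade levels play no role.
    return program.has_explicit_standard and program.decides_about_individuals


# Made-up examples, not programs from the study.
programs = [
    Program("District A", True, True, (9, 10, 11, 12)),
    Program("State B assessment", False, False, (4, 8)),
    Program("District C", True, False, (3, 6)),
]
included = [p.name for p in programs if meets_study_definition(p)]
print(included)
```

Only the first hypothetical program satisfies both features, so it alone would count as an MCT program under this reading of the rule.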

The presence of a standard gives meaning to the concept of pass/fail,
and so distinguishes MCT from statewide assessments. In the latter,
student achievement may be monitored individually (although many
assessments use sampling rather than census testing), but not with respect to
any specific standard; i.e., a student does not pass or fail the tests.
Student results are generally reported according to groups if sampling is
used, rather than by individuals. If individual results are reported,
they are usually interpreted at the discretion of individual teachers. In
minimum competency testing, by contrast, students are required to achieve
certain minimum standards of performance; that there are specific
consequences to students for meeting or not meeting the standards is the second
distinctive feature of MCT.
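The contrast between the two kinds of reporting can be sketched in a few lines. The scores, the cutoff of 70, and the rule that a score equal to the cutoff passes are all assumptions made for illustration; the report does not prescribe any of them.

```python
from statistics import mean

CUTOFF = 70  # hypothetical standard; real programs set their own

# Made-up scores for three students.
scores = {"student_1": 82, "student_2": 64, "student_3": 70}


def assessment_report(scores):
    """Statewide-assessment style: a group summary with no pass/fail judgment."""
    return {"n": len(scores), "mean": mean(scores.values())}


def mct_report(scores, cutoff):
    """MCT style: each student is judged against the explicit standard.

    Assumption: a score equal to the cutoff counts as passing.
    """
    return {s: ("pass" if v >= cutoff else "fail") for s, v in scores.items()}


print(assessment_report(scores))
print(mct_report(scores, CUTOFF))
```

The same scores feed both reports; what changes is whether a standard is applied to each individual, which is what triggers the consequences discussed next.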

In the programs of the study, consequences to students who achieve
the minimum standards may range from the receipt of a high school diploma
or certificate of special recognition to promotion from grade to grade.
Consequences for not meeting the standards can include compulsory
enrollment in remedial classes, grade retention, or the receipt of a certificate
of school attendance instead of a high school diploma. Regardless of the
importance of the consequences or whether they are applied for passing vs.
failing the tests, the fact remains that some kind of consequences are
present in programs accepted as minimum competency testing programs.



"To Implement, or not to Implement, MCT..."



Minimum competency testing is, without question, one of the most
hotly debated subjects in the world of testing today. Proponents make
strong claims about its potential benefits, and opponents argue just as
strongly about its potentially harmful effects. It is not the purpose of
this chapter to determine, once and for all, the various impacts of MCT or
whether they are harmful or beneficial. Rather, the intent is to present
major issues for policymakers to consider as they make decisions about
whether MCT will serve the particular goals and purposes established for
their testing programs. For policymakers on the point of making a
decision about minimum competency testing, weighing the advantages and
disadvantages of MCT, especially as these relate to a particular program, will
help to reach decisions that are well-informed and reasoned. One of the
chief criticisms of MCT programs today concerns the speed with which
implementation has been required, a speed which has not always allowed
program developers the time to plan as carefully as they might like.

Because this chapter is to be nonevaluative and impartial in its
discussions of the issues, it is hard to know which side of the
controversy to present first. Beginning with either the pro- or the anti-MCT
arguments could be construed as presenting, however subtly, a specific
position on the issues. Therefore, a decision was made to determine the
order of presentation by flipping a coin: heads, the pro-MCT arguments go
first; tails, the anti-MCT arguments go first. The coin turned up heads.

Perceived Benefits of Minimum Competency Testing 



Listed in Table 2 are a number of perceived benefits of minimum
competency testing that have been culled from a wide variety of sources,
including the research literature, MCT program publications, professional
conference proceedings, and personal communications during the site visits
conducted in this study. Each of these has been cited as a benefit or
potential purpose or useful effect of minimum competency testing by at
least one person in the field. Most have been cited any number of times
as reasons for implementing MCT either locally or statewide. The benefits
appear to fall into a finite set of types: MCT may (1) restore confidence
in the high school diploma, (2) involve the public in education, (3)
improve teaching and learning, (4) serve a diagnostic, remedial function,
and (5) provide a mechanism of accountability.



TABLE 2

Perceived Benefits of Minimum Competency Testing

• restores meaning to a high school diploma

• reestablishes public confidence in the schools

• impels us to face squarely the question of "what is a high school education?"

• sets meaningful standards for diploma award and grade promotion

• challenges the validity of using seat time and course credits as basis for
  certifying student accomplishments

• certifies that students have specific minimum competencies

• involves the public and local educators in defining educational standards and goals

• focuses the resources of a school district on a clear set of goals

• defines more precisely what skills must be taught and learned, for students,
  parents, and teachers

• promotes carefully organized teaching and carefully designed sequential learning

• reemphasizes basic skills instruction

• helps promote competencies of life after school

• broadens educational alternatives and options

• motivates students to master basic reading, mathematics, and writing skills

• stimulates teachers and students to put forth their best efforts

• identifies students lacking basic skills at an early stage

• encourages revision of courses to correct identified skill deficiencies

• ensures that schools help those students who have the greatest educational need

• can bring about cohesiveness in teacher training

• can truly individualize instruction

• shifts priorities from process to product

• holds schools accountable for educational products

• furnishes information to the public about performance of educational institutions

• provides an opportunity to remedy the effects of discrimination by identifying
  learning problems early in the educational process

• provides greater holding power for students in the senior year

• provides for easier allocation of resources



Let us consider first the view that minimum competency testing can
restore confidence in the high school diploma. It has been apparent for
some time that there is widespread public disillusion and dissatisfaction
with the quality of American education. Employers complain that
applicants with high school degrees are unable to complete job applications
correctly. Colleges and universities complain that they must institute
remedial reading classes in order to raise the reading ability of incoming
students to levels high enough for college work. The public points to
declining test scores as an indication of the inadequate skills which
students possess at graduation. In the light of this evidence, all segments
of the public are concerned to know what a high school diploma actually
certifies about the skills of the student. And MCT is seen as a way of
clearly and precisely demonstrating what students can do and of ensuring
that they have those "minimum" skills necessary to function in society
(e.g., AFSC, 1978; NSBA, 1978). An auxiliary benefit is that along with a
precise definition of skills and a demonstration that students indeed have
those skills will come a greater public confidence in the educational
system (e.g., AFSC, 1978; Nickse, 1978).

According to Walker (1978), the main support for MCT has come from
the public, and the second category of perceived benefits relates to the
involvement of the public in educational goal setting. Proponents of MCT
cite as one of its benefits the fact that responsibility for defining the
goals and intended outcomes of a high school education is shared by
educators and the public (e.g., NSBA, 1978; Nickse, 1978). It is certainly the
case that, in most MCT programs, administrators have considered it
important to involve representatives from such constituencies as parents, the
business community, and outside educational organizations. Frequently,
surveys of these groups have also been conducted for the purpose of
providing input to the processes of defining and/or validating competencies
and setting standards.



The realms of teaching and learning comprise a third area in which
its proponents consider that minimum competency testing will have a
beneficial impact. Since a legal question may arise as to whether one may
test a skill that has not been directly taught, many supporters see MCT as
an impetus to a careful examination of the curriculum in light of the
goals of the MCT program (e.g., AFSC, 1978). Other MCT advocates believe
that a reemphasis of the basic skills is in order and can be accomplished
through minimum competency testing (e.g., NSBA, 1978). Still others, who
advocate a systems or competency-based approach to education, consider MCT
to be the means for restructuring curricula to reflect such an approach.
Finally, there are those who feel that MCT will increase the motivational
levels of both students and teachers (e.g., NSBA, 1978).



Related to the hope that MCT will help to improve teaching and the
curricula is the expectation that it will stimulate the establishment of
remedial programs for students shown to be deficient in the basic skills
(e.g., NSBA, 1978; AFSC, 1978; Wilson, 1976). In many MCT programs, the
major goal of testing is to identify those students who need additional
instruction; the intended remedy for deficiency is most often remediation.

Finally, although some MCT programs specifically forbid the use of
test results for accountability purposes, accountability is still a live
issue in the field of education, and MCT is seen as one way of establishing
accountability. Students, teachers, and administrators alike can be
held accountable for their respective educational responsibilities (e.g.,
Scott, 1978).



Perceived Costs of Minimum Competency Testing

Enumerated in Table 3 are the perceived disadvantages of MCT which
are commonly cited by opponents of minimum competency testing. Like the
perceived benefits, the perceived costs center on the potential effects of
MCT on a variety of elements, and these effects are seen to be harmful in
some way. Once again, the discussion may be facilitated by grouping the
points according to the element affected. Therefore, perceived
disadvantages may be seen in terms of the potential harmful effects of MCT on
(1) various populations of students, (2) the curriculum, (3) teachers and
administrators, and (4) control of education.

With respect to its effects on various student populations, the
criticisms of minimum competency testing are several. Opponents of MCT
believe that underachievers, diagnosed as being "below competency
standards," will suffer from further labeling, especially if the receipt of a
standard high school diploma is contingent upon passing a competency test.
On the other hand, it is claimed that average students are unrecognized
and gifted students go unchallenged in MCT programs (AFSC, 1978).
Advocates of racial, ethnic, or special education students assert that
competency testing may promote bias against these groups, especially if school
systems are believed to be already segregated or discriminatory against
these student populations in some other way (Airasian et al., 1978; Scott,
1978). Finally, minimum competency testing may unfairly place the burden
of failure squarely on the student, rather than making failure a shared
responsibility of student, teacher, and school system (AFSC, 1978).



TABLE 3

Perceived Costs of Minimum Competency Testing

• emphasis on the practical will lead to an erosion of liberal education

• causes less attention to be paid to difficult-to-measure learning outcomes

• promotes teaching to the test

• will be the "deathknell for the inquiry approach to education"

• oversimplifies issues of defining competencies and standards and of granting
  credentials to students

• promotes confusion as to the meaning of the high school diploma when
  competency definition is left to local districts

• fails to adequately consider community disagreement over the nature and
  difficulty of competencies

• will exclude more children from schools and further stigmatize underachievers

• will cause "minimums" to become "maximums," thus failing to provide enough
  instructional challenge in school

• may cause an increase in dropouts, depending on the minimum that is set

• provides no recognition of the "average" student

• fails to provide alternatives that can "inspire" average students to excel in
  some areas

• ignores the special needs of gifted students, giving them less opportunity to
  be challenged and to expand their horizons

• may have adverse impact on a student's future career as a result of a withheld
  diploma

• may promote bias against racial, ethnic, and/or special needs groups

• places the burden of "failure" on the student

• causes educators to be held unfairly accountable

• intensifies the conflict for educators between humaneness and accountability

• increases the record keeping burden for administrators

• does not assure that students will receive effective remediation

• does not assure that all of the perceived needs and benefits will be met and
  realized

• promotes the power of the state at the expense of local district autonomy

• can be costly, especially where implementation and remediation are concerned



"Minimums will become maximums!" is a commonly expressed fear about
the effect of minimum competency testing on curriculum. Most educators
admit that it is difficult to define "minimum competency," and therefore,
critics raise questions about what a diploma can really mean if different
definitions of competency are derived by individual local districts (NSBA,
1978). There are also fears that MCT may lead to a narrow and overly
limited curriculum, because of the emphasis which such programs seem to
place upon a certain few basic skills and upon those skills which lend
themselves to definition in measurable terms.

Issues of teacher and school accountability seen by some as beneficial
are seen by others as harmful effects of minimum competency testing.
Opponents of MCT assert that educators are often held unfairly accountable
and that minimum competency testing only serves to intensify the conflict
between "humaneness" and "accountability" in the role of the educator
(ASCD, 1978; NSBA, 1978). Furthermore, the initiation of MCT may unfairly
place additional burdens upon school teachers and administrators in the
form of extra record keeping and, in some cases, mandatory curriculum
reform (NSBA, 1978). Already busy school personnel, in other words, will
be expected to assume additional roles and tasks with the effect, perhaps,
of decreasing their time to produce enriched curricula.

Finally, its effect on control of education is seen to be a
disadvantage of minimum competency testing (Nickse, 1978). In many states, local
autonomy is a valued prerogative, and MCT mandated on the state level is
seen as an infringement of that prerogative. Local school districts also
complain that the states often impose certain requirements and yet give
little or no financial or technical support to help the local districts
comply. This same argument can also apply at the state level, since in
some cases the legislature may enact certain requirements and yet fail to
appropriate funds to support compliance.



Criticism of a Different Nature 

In addition to the perceived advantages and disadvantages of minimum
competency testing enumerated above, writers in the field have offered
other criticisms of a somewhat different nature. Those costs and benefits
already discussed are generally predicated on the assumptions (1) that the
identification and definition of competencies and minimum standards of
performance is a straightforward process, and (2) that principles and
techniques exist for the construction of reliable and valid test instruments.
The other criticisms, by contrast, tend to focus on these two assumptions
and also on the actual implementation procedures for minimum competency
testing.

With respect to the first assumption, some critics have taken issue
with the "seductive nature" of the vocabulary used in minimum competency
testing programs. "Undefined, perhaps undefinable terms are used without
consideration in discussing MCT programs, and it is only when one thinks
through the meaning and application of such terms that the apparent
simplicity of MCT is stripped away revealing its true complexity" (Airasian
et al., 1978, p. 21). In conferences held for the purpose of aiding
participants in the identification of competencies, "some participants
were surprised and at times disappointed at the lack of consensus
regarding answers to such questions as, 'what are the definable skills which
adults cannot live without?'" (Miller, 1978). Airasian also raises a
concern about the particular selection of competencies by suggesting that
schools may have promised too much. It is possible, for instance, that
schools may have attempted to identify and measure competencies that
cannot be achieved by a majority of students, and Airasian asserts that, if
this is so, it would be unfair for the schools to expect mastery, and then
to penalize students for not achieving it.

The process of setting minimum standards of performance has also been
subject to the type of criticism described above in that standards are
much more difficult to define and agree upon than might be suspected at
first glance. According to a panel established by the National Academy of
Education to consider issues on testing and basic skills, "the present
measurement arts of educational testing are simply not up to the ambiguous
expectations reflected in most state legislation" (NSBA, 1978, p. 13). As
with the definition of competencies, an infinite variety of professional
disagreement can occur during the identification of minimum performance
standards.

Those challenging the second assumption (i.e., that adequate
technology exists for measuring competency achievement) point to the problems
inherent in validating tests of life skills achievement. A danger already
mentioned is that of making competencies trivial in order to render them
measurable.

Finally, implementation issues that are raised typically revolve
around the methods chosen to solve such problems as what grades to assess,
when to apply sanctions for passing or failing the tests, what standards
to establish, who should be involved in planning the program and how to
promote their involvement, how to deal with students whose native language
is not English, and how to integrate competency testing with the
curriculum and with other forms of testing (Greene, 1979). To consider the
various answers to these questions and the reasons for particular answers is a
major purpose of this document.



Summary 



Beyond, or even with respect to, the considerations for and against
minimum competency testing discussed above, "school leaders recognize a
diverse and contradictory set of motivations: to cut spending and to
raise it, to prove schools good and to prove them bad, to cause curriculum
change, to help minorities and to legitimize discrimination" (NSBA, 1978,
p. 31). There is an old Persian proverb that says: "Where there are two
people, there are at least three opinions." That is certainly the case in
the controversy over minimum competency testing, and it is also the case
that what appears to be an advantage of MCT according to one person is a
disadvantage according to another, and vice versa. What can be learned
from any discussion then? "The decision of whether or not to implement a
minimum competency testing program should involve a weighing of the
positive and negative consequences of either decision" (NSBA, 1978, p. 19).
Furthermore, it has been urged "that the primary needs people perceive
being met by minimal [sic] competency programs be articulated and that
these needs be examined in the light of whether such programs, as
currently conceived, actually respond to those needs" (Airasian et al., 1978,
p. 2).

A number of authors suggest, then, that program developers analyze
their own needs, consider both sides of the MCT issue in relation to those
needs, and also look into possible alternatives to competency testing for
meeting those needs. While many advocate using MCT to diagnose students
for remediation, for example, it has been suggested that "teachers have
probably already identified these students and their problems" (Elford,
1977, p. 10). In addition, "the effective use of test data already
collected would seem the most logical approach to early identification and
remediation at this level. Local studies could demonstrate the degree to
which the elementary achievement tests predict later success in the high
school competency test" (Elford, 1977, pp. 10-11). MCT, in other words,
may not be the best method by which to collect diagnostic information
about students who need remedial aid.



Finally, some authors have suggested that, in making a decision as to
whether to implement minimum competency testing, program developers would
do well to consider the lead-in time available, the needs of their special
student populations, and the funds available. The answers to these
questions could determine how feasible minimum competency testing is at a
particular time, given that it suits all of the other needs of the
developer.



References 



Airasian, P., Pedulla, J., & Madaus, G. Policy issues in minimal
competency testing and a comparison of implementation models. Boston:
Heuristics, Inc., 1978.

American Association of School Administrators. The competency movement:
Problems and solutions. Sacramento, California: Education News
Service, 1978.

American Friends Service Committee. A citizen's introduction to minimum
competency programs for students. Columbia, South Carolina:
Southeastern Public Education Program, 1978.

Cohen, D., & Haney, W. Minimums, competency testing, and social policy.
Cambridge, Massachusetts: The Huron Institute, 1978.

Elford, G. A review of policy issues related to competency testing for
high school graduation. Paper presented at the meeting of the New
England Educational Research Organization, Manchester, New Hampshire,
May 1977.

Greene, L. F. What can minimum competency accomplish? One response.
National Elementary Principal, January 1979, 23-24.

McClung, M. S. Competency testing: Potential for discrimination.
Clearinghouse Review, August 1977, 439-443.

Miller, B. S. (Ed.). Minimum competency testing: A report of four
regional conferences. St. Louis, Missouri: CEMREL, 1978.

National Association of Elementary School Principals. Down and out in
the classroom: Surviving minimum competency. National Elementary
Principal, 1979, 58(2).

National School Board Association. Minimum competency. A research
report, 1978.

Nickse, R. S. Comments on MCT. Proceedings of the National Conference
on Minimum Competency Testing. NWREL, October 1978.

Spady, William. Competency-based education: A bandwagon in search of
a definition. Educational Researcher, 1977, 6(1).

Scott, L. Summary of the Fall 1978 Conference of the National Consortium
on Testing. Cambridge, Massachusetts: The Huron Institute, November
1978.

Walker, D. F. The impact of MCT on school curricula, teaching, and
students' learning. Proceedings of the National Conference on Minimum
Competency Testing. NWREL, October 1978.

Wilson, H. A. Two sides to test: Positive, negative. National
Assessment of Educational Progress, June 1976.



For consideration of a variety of issues related to minimum competency
testing and the development of an MCT program, the reader is also referred
to:

The Minimum Competency Testing Movement. Phi Delta Kappan, May 1978,
59(9): entire issue.



CHAPTER 2. 
DEFINING COMPETENCIES* 
Marcy R. Perkins 



Introduction



"Impenetrability! That's what I say!" continues Humpty Dumpty in his
discussion on managing words (Lewis Carroll, Through the Looking Glass).
In defining competencies, as in defining minimum competency testing,
penetrating the wall of words to get through to an acceptable meaning of
competencies sometimes seems to be an impossible task. According to the
National Council on Measurement in Education (NCME) Task Force, for
example, "a review of the elements of competency requirements across state
and local districts suggests that the rule for defining competency is that
anyone can define it in any way they please as long as they state what
they mean" (Bunda & Sanders, 1979, p. 10). The NCME Task Force goes on to
assert that no technical definition of competency prevails in the field.

The purpose of this chapter is to consider the issues related to
defining competency as a general concept and to competencies as the
specific statements forming the basis for measurement in a testing program.
Questions about the people who can be involved in competency definition,
the processes that can be followed, and the content, format, and
organization of competencies that can be specified will all be discussed in this
chapter.

One of the principal reasons that so many definitions of competency
and so many processes of defining competencies exist is that MCT programs
vary greatly in their purpose, size, locus of control, history, and
policies. Consequently, it is very likely that one process or set of answers
to the relevant issues will not be appropriate for all programs. The
intent of this chapter, then, is to bring out the kinds of issues that a
program developer or reviewer is likely to encounter, on the basis of
situations and occurrences drawn from ongoing programs, and to present
potential ways of dealing with the issues, once again on the basis of
methods employed in current programs.



Assumptions



While an implicit assumption of this chapter may be that the reader
is in some way involved in developing or reviewing a minimum competency
testing program, the nature and presentation of the material does not
depend upon that assumption. A thoughtful consideration of the issues
discussed here might well assist policy makers in making a decision as to
whether or not MCT is appropriate for their purposes.

The actual measurement of competencies is the subject of Chapter 3.
However, many of the issues that arise during the process of identifying
competencies also have implications for how those competencies are
measured. Therefore, there will be a certain overlap between this chapter
and the next, and it is recommended that both be read for a more complete
picture of the activities related to competency assessment.



Limitations 



Presenting more than a single process or solution with respect to
competency definition does not imply that every one of the possible or
existing processes or solutions will be discussed. Limitations on space
prevent the fullest treatment of issues possible. Furthermore, this
chapter will not discuss competencies in specific subject areas (e.g.,
reading, mathematics, or democratic governance), nor will it debate the
implications of statewide versus local definition of competencies for the
meaning of a high school diploma. Rather, processes will be discussed in
a general way, as applicable across subject areas and by various governing
bodies.



Structure of Chapter



It is apparent, both in the literature on MCT and in the programs
surveyed, that most programs have utilized similar procedures and
encountered similar issues in their identification of competencies. In some
cases, the process is an explicitly defined one, developed by the agency
to facilitate the accomplishment of the task and ensure that the relevant
issues are all addressed in some way. In this chapter, discussion will
begin with procedures for competency definition common to programs in the
field, followed by the common issues that program developers have faced in
implementing those procedures. In conclusion, several examples of overall
processes or systems for competency development will be presented and
discussed. Before proceeding with the topic of how competencies can be
defined, the general concept of competency as it will be treated in this
chapter needs to be clarified.



Competency-Competencies: A Treatment of Terms



"Competency" appears to be generally understood in the field as a
level of ability at which the examinee can demonstrate the appropriate
application of skills to problems or life-role situations (NSPRA, 1978).
While the concept of application is not always included in every
definition of competency, the notion of a specified or desired performance level
is. That is what typically forms the basis of the standards determined
for the competency assessment.

"Competencies," by contrast, are seen as specific statements of
desired performance. The Northwest Regional Educational Laboratory's
conceptualization of competencies, for example, is that "competencies are
student outcomes which a school system believes its students should attain
before graduation or completion of a course or program" (NWREL, 1978,
p. iv). These student outcomes are frequently interpreted as comprising
specific learning objectives setting forth those basic academic skills
deemed necessary for students to acquire. And these types of outcomes
have been called "objectives," "behavioral objectives," "performance
objectives," "performance indicators," "standards," "competencies," and
other terms, dependent seemingly upon the level of detail and amount of
performance specified.

Some take the contrasting view that competencies are different "from
other student goals and objectives in that they describe the student's
ability to apply basic and other skills in situations that are commonly
encountered in everyday life" (NWREL, 1978, p. vi). In Oregon's MCT
program, for example, a competency is "a statement of desired student
performance representing demonstrable ability to apply knowledge, understanding
and/or skills assumed to contribute to success in life role functions"
(AASA, 1978, p. 45). Still others interpret competencies as being
descriptions of common and useful skills, and make no added distinction as to
whether these skills are applied, basic, or life-oriented.



While the issue of emphasis (life skill vs. basic skill) related to
defining competencies will eventually be discussed in depth in this
chapter, the distinction that is necessary at this point is between the
generic use of "competency" and the specific use of "competencies." In
this chapter, competencies will be used generally to mean specified
student learning outcomes. No specific emphasis will be assumed, nor will it
be expected that any particular amount of detail is to be defined. These
are issues which will be treated within the context of the chapter.



Basic Elements in the Process of Defining Competencies



Defining competencies is a step in the development of a minimum
competency testing program that provides structure at two levels in the
program. Competencies can be used within the program as the basis for
teaching, testing, or both. In fact, all competencies for a K-12 or high
school program may be defined, with a subset of these selected for testing
at each of the target grades. Defining competencies, therefore, helps to
provide the overall instruction/evaluation sequence and scope within the
program and to identify the specific domains to be tested. Having a set
of competencies is also a prerequisite to determining specifically what
the tests will measure, in terms of skills, content, and item difficulty
level.

A look at how program developers are identifying competencies in
operative minimum competency testing programs indicates the existence of
at least eight major steps or components in the process. These include:



— deciding whether to develop or select competencies; 

— acquiring resources; 

— establishing a task force or advisory committee; 

— developing a competency framework/skill emphasis; 

— defining competency content domains; 

— writing/selecting; 

— reviewing/refining/validating the competencies;

— selecting the final set of competencies. 



Certainly program developers are free to select whatever procedures seem
to be most appropriate for their specific programs and particular purposes
and to apply these procedures, with their own staffs or in conjunction
with a contractor, in whatever order or process makes the most sense
for their programs. For the purpose of this discussion, however, the
tasks will be presented in one possible logical order that has been
utilized in the field. Issues related to each of the eight tasks will be
raised and discussed, and, wherever possible, alternative activities and
ways of applying various procedures will be presented.



Developing versus Selecting the Requisite Competencies 



The major difference between selecting competencies and writing them
is the source from which they are drawn for inclusion in the program. In
other words, those charged with identifying the competencies may nominate
ones which they have created or which they have drawn from some extant
competency collection. They may also choose to adapt an existing
competency rather than to nominate it in its original form. All of the other
parameters that must be specified for the competencies, however (such as
emphasis, topical domains, number, specificity, etc.), are the same for
both selection and development. Similarly, the review, refinement, and
validation processes are the same for both.

The decision to develop or select, therefore, may depend entirely on
such considerations as the program timelines, resources, and overall goals.
Questions to be asked at the outset, then, include:



— Are the purposes of our program such that we know that there 
are no competencies extant that will match them? 

— How much time do we have to identify competencies? 

— What is the status of our resources? 



If the answer to the first question is affirmative, this will entail
the development of competencies specifically geared to the program, which
will have the advantage of ensuring a match between program purposes, the
competencies, and the assessment of the competencies; it can also engender
a sense of ownership in connection with the competencies and the program,
because of the high degree of involvement in this process on the part
of the developers. The costs in terms of time and money required to
identify competencies, however, are greater for developing than for selecting
them; timelines and budgets may therefore preclude the use of this
procedure.



If implementation schedules and budgets are restricted, then selecting
competencies with their associated assessments may be the more feasible
approach. The trade-off here, however, is at the expense of the congruity
or fit between the purposes of the program and the competencies selected.

Finally, if it is imperative to identify the competencies immediately
and in a constricted timeline, it is most likely that the decision will be
to select already existing ones. Since the other procedures involved in
identifying competencies apply regardless of whether the decision is made
to develop or select, the remainder of this chapter will treat the two
together, noting only those points at which they might diverge.



Acquiring Resources 

A useful first step in the identification of specific competencies
for a program is to review sets of objectives that already exist in the
field. Even if the decision has already been made to develop competencies,
it is still easier to react to existing materials than to create from a
void. And by reviewing objectives in programs similar to theirs, program
personnel can begin to define more clearly what kinds of competencies will
be needed within the context of their own program. How competencies are
identified, and where the skill emphasis should be, can also depend upon
the resources available to a particular program. Before such a review can
occur, however, resources must be obtained.

The task of acquiring resources is typically undertaken by the
program director(s) and can be done even prior to the establishment of a
competency testing program. In this case, the existence of appropriate
performance outcome statements may well affect the initial decision to
implement a testing program.

The ways for obtaining competency statements and sources from which
they are available are numerous and varied. For programs concerned with
matching competencies and their assessment to existing curricula, lists of
skill statements or objectives can be acquired from local schools
throughout the state or district. In the North Carolina state MCT program, for
example, the final set of reading and mathematics competencies is based
upon a collating and ranking of objectives collected from all parts of the
state. While this process appears to be straightforward, reviewers are to
be forewarned that matching competencies to a diversity of curricula can
be no small task.



In other states, lists of competencies that reflect state goals (often
with specific performance indicators and sample assessments) are obtainable
from the State Department of Education. The Utah Board of Education, for
example, appointed subcommittees to develop sample objectives and
performance indicators which were to be made available to local districts.
Similarly, the California and Illinois Departments of Education provide
local districts with technical assistance manuals which include statements
of competencies. The advantage of these types of guides is that they
present objectives which are matched to state goals and which are perhaps
broad enough to be applicable to most curricula within the state. They do
run the danger, however, of being too broad in scope to be useful to
specific programs.

Finally, commercial objective banks are sources from which competency
statements may be drawn. The NWREL has developed a listing of available
collections of objectives which is included in the "Outcomes" section of
their Guide to Identifying High School Graduation Competencies (1978). In
order to help planners select collections that will be most useful to them,
this listing provides the following information about each collection that
it references: title, description, originator, intended users, purpose/
content, usefulness in relation to competency-based education, history of
development, related materials, and ordering information.

The acquisition of competencies by one or all of the methods
mentioned calls for yet another decision as to who will be responsible for
the process of review and for the final selection of competencies for the
program.



Establishing a Task Force or Advisory Committee 



In general, identifying competencies is accomplished by an advisory
committee, often representing a cross section of the state's or district's
educators, administrators, and consumers of education (such as parents,
students, or business people). In programs that elect to contract with a
testing agency for their competency definition and test development, the
responsibility of overseeing that work and guiding the development of the
competencies still rests with the program staff and/or advisory committee.
Frequently, the local or state board of education is responsible for
officially adopting or approving a set of competencies, but the first question
is how to determine the composition of the set which will be submitted for
approval.



While it is not the case that a task force must be established within
a competency program, program developers have usually found it advantageous
to do so. In general, a greater feeling of ownership in the program, with
a subsequent higher probability of program success, occurs when those who
will be directly involved in or affected by a program participate in its
development.

There are a number of questions to consider, however, in selecting the
competency task force. These include:



— Will there be one group or more?

— What will the composition of the group(s) be?

— What will the size of the group(s) be?

— How will the members be selected? 

— What will be the responsibilities/tasks of the members? 



Let us consider each question in turn.



Will there be one group or more? The answer to this question is
likely to depend upon other program parameters, such as the overall size
of the program and the number of competency areas that have been selected
for assessment. If, for example, the competency areas selected are
numerous, a task force to concentrate in each area may be desired. Even
with as few as two competency areas, separate task forces or subgroups of
a larger committee may be sought to represent each of the subject areas.



What will the composition of the group(s) be? The Ohio Department of
Education recommends in its Competency Handbook (1978) that competency
committees be composed of "administrators, classroom teachers and education
specialists." The Colorado Department of Education similarly recommends
involving "teachers representing different areas" (Colorado, SDE,
1975), and the Illinois Office of Education (1978) suggests that committee
membership might include representatives of the community (e.g., opinion
leaders), a cross section of local groups interested in the program,
representatives of ethnic and cultural groups, parents, school staff, and
students.


It is clear that the options for committee membership are numerous
and varied; the ones ultimately chosen may depend on whether the program
is to be developed at the state or local level, the resources available
for the individual program to draw upon, the kind of expertise desired,
and the amount of community involvement desired. The consequences of
passing the competency tests may also help to dictate the composition of
the advisory group. If high school graduation depends upon mastery of the
competencies, for example, the identification process will have to be all
the more defensible. Involving a representative sample from different regions
of the district or state, from different ethnic backgrounds, from different
socioeconomic levels, and from different levels of the educational
administration will help to ensure that the process is a legally defensible and
politically acceptable one.


What will the size of the group(s) be? The Illinois Office of Education,
in Performance Indicators for Competency Assessment (1978), suggests
that task forces have 15-25 members, with a number of alternate members,
and operating programs have typically had committees of 10-20 members
(e.g., Massachusetts, New Jersey, Maryland). Here again, available
resources, type of representation desired, and manageability are factors
to consider in determining how large a group is desirable.



How will the members be selected? Procedures for the selection of
the competency task force committees can include appointment by the state
or local board of education; appointment by the local or state
superintendent, program coordinator, or other school administrators; random
selection by the superintendent or coordinator of representatives of various
groups; and open invitation to various groups to obtain their
participation. The specific procedure selected, according to the Illinois Office
of Education, needs to be "defensible to the public and conducive of
efficient task force operations," and "patterned after the selection
procedures typically used by the local district for selecting members of
other advisory groups" (Illinois, SOE, 1978, p. 9).



What will be the responsibilities/tasks of the members? In general,
competency task forces are charged with identifying and recommending a set
of competencies upon which to base assessment. Their specific functions
will vary depending upon the purposes of individual programs and the
reasons for which individual members may have been appointed. For example,
members who represent constituencies within the community may provide input
from those constituencies to the process of defining competencies. Committee
members may also review existing competency sets, review competencies
developed by a contractor, and/or develop their own competencies.






Developing a Competency Framework/Skill Emphasis

The basic skills/life skills distinction . Developing a competency
framework or skill emphasis really means coming to agreement within the
program on the issue of the appropriate context for the competencies. And
this relates once again to the primary purposes of the program. Is mastery
of the competencies to certify that students possess certain minimum
academic skills upon completion of a particular grade (basic skills approach),
or is mastery to indicate that students have the skills necessary for
adulthood and the situations they are likely to encounter as adults (life
skills approach)?

To state it simply, basic skills are those skills which parents and
society in general expect students to learn and use in school, e.g.,
reading, writing, and arithmetic. Life skills may be these same (basic) skills
applied in a "life role" or non-academic context, e.g., reading newspapers
(instead of textbooks), filling out job applications (instead of writing
book reports), and adding grocery tapes (instead of lists of abstract
numbers). Or, life skills may include additional skills not generally
considered school skills, e.g., using a telephone, administering emergency
first aid, and learning to use a voting machine. A minimum competency
testing program, depending on its purposes and emphases, may include any
one or all of these approaches to defining those competencies in which
students must demonstrate mastery.

The Board of Regents in Rhode Island, as one example, distinguishes
among three levels of educational achievement: basic skills, minimum
competency, and standards of excellence. Basic skills in Rhode Island
comprise specific skills in reading, language arts, mathematics, and
cultural arts. Minimum competency, on the other hand, is defined as the
achievement of certain basic life skills, or competency in everyday tasks;
these tasks are still organized, however, according to the domains of
reading, mathematics, language arts, and cultural arts. The standards of
excellence, not yet an integral part of Rhode Island's developing program,
are considered to be advanced life skills reflected in outstanding
scholastic and cultural achievement.

The State Department of Education in Nebraska breaks down the domain
of potential skills for assessment in a somewhat different way. In
developing the N-ABELS tests, the Department distinguished among life-coping
skills, basic skills, and essential learning skills. Life-coping skills
are conceptualized in Nebraska in much the same way as minimum competency
in Rhode Island; they are defined as those applied skills such as
balancing a checkbook and completing a job application. Nebraska's basic skills
are similar to Rhode Island's in that they are considered to be skills used
primarily in a school setting. Essential skills in Nebraska, however, are
conceptualized as "a subset of the 'basic skills' which are fundamental to
continued learning. Essential learning skills are the tools of learning
necessary for successful acquisition of competencies in the broader skill
areas" (Nebraska, SDE, 1977, p. 1).


In contrast to the programs in both Rhode Island and Nebraska, the
Georgia program emphasizes life skill assessment. According to the policy
in this state, "the State Board of Education defines as a major role of
the public schools the responsibility to ready the children and youth of
Georgia for contemporary life roles." The Competency Performance Standards
are therefore defined in terms of five life roles: Learner, Individual,
Citizen, Consumer, and Producer.

The point which these three programs illustrate is that, while many
different labels exist for types of skills, the differences among them are
actually superficial. It appears instead that the skill emphasis (life
role versus basic) indicates less about the actual skills to be assessed
than about the context within which they are to be tested. It may be, in
other words, that the same reading skills are invoked when students read
textbooks (basic skills) as when they read and respond to newspaper want
ads (life skills). Important to keep in mind then, during this process of
competency identification, is the relationship between assessment and the
way in which competencies are defined.



Basic skills to life skills: Issues regarding emphasis . Other
issues which the committee may need to consider in specifying an
appropriate framework relate to competency measurement, curricula, potential
legal problems, and public acceptance.

First, the nature of the definition of a skill or competency will
affect the choice of assessment procedures by which to measure student
achievement. If the competency is defined as being able to deposit money
in a savings account, for example, then the ideal form of assessment is to
require a student to go to the bank and deposit a sum of money into a
savings account (presumably the student's own). That sort of real-world
assessment can be difficult, time-consuming, and costly. A close
approximation could be to present students with a simulated deposit situation and
require them to fill out bank deposit slips. The result, however, is that
test item validity can be called into question, and, indeed, the process
of validating the competencies themselves may provide results that can be
called into question.

For programs that opt to define their competencies on the basis of
their curricula or to structure their curricula to match their
competencies, life role competencies can present a problem. Airasian et al.
(1978), for example, question whether we currently have the understanding
and technology to teach strictly life-oriented competencies, at least to
the degree that we can go "on record assuming the major responsibility for
fostering the selected competencies" (p. 29). The NWREL (1978) also
points out that the life-role competencies are difficult to identify and
agree upon, that they will perhaps require a change in the instructional
program, and that they may possibly be so interdisciplinary in nature that
curricular change or integration into the curriculum may be extremely
difficult to accomplish.

Related to the problems of measurement already discussed is the
potential for legal challenge offered by programs in which the
competencies are either not directly taught in the curriculum or not established
as being valid. Strictly life role competencies are particularly
vulnerable to this charge since they can be the most difficult to validate and
to incorporate in the curriculum.

Finally, both Airasian et al. (1978) and the NWREL (1978) suggest
that the competencies selected for a program should have a broad base of
public support, especially given the fact that the impetus for competency
testing has come largely from the public sector. Ways of ensuring this
kind of support include involving representative audiences in committees
and submitting recommended competencies to a general public review.

Following the consideration and discussion of the above issues, one
useful approach for the committee is to come to a
consensus on which general emphasis is desired, and then further delineate
domains for assessment. The latter can be accomplished by specifying
first those domains to which, in the committee's view, all students have
been exposed by the time of testing, and second, those additional domains
that represent ideal learning outcomes. Then the committee will be ready
to define the content domains for competencies more specifically.



Defining Competency Content Domains 



Organization or topical outline . In general, some kind of topical
outline by which to organize the competency domains may serve as a useful
beginning point for writing or selecting specific competencies. A topic
outline or set of goal statements is a general plan for organizing the
more specific competencies. In its simplest form, the outline may require
only a few category headings to identify subdomains, which can then be
further defined by competencies. For example, a general outline for a
language arts test may begin with these categories:



I. Decoding 

II. Vocabulary 

III. Writing Skills 

IV. Reading Comprehension 

V. Reference Skills 

From this beginning, the outer limits of the domain begin to appear; with 
each subtopic or competency added under one of the category headings, the 
shape of the domain becomes more focused. 

One consideration to keep in mind here, perhaps, is how the test
results are to be reported. When devising a topic outline, programs often
find it both convenient and informative to report student scores in terms
of domains or subdomains, rather than just by total score for the subject
area. Any number of other organizational strategies are also available
for generating some kind of topical framework.

It may be the case, for example, that the competency areas, and
perhaps some of the specific skill statements, have already been set by
the legislature, or by the state or local board of education. In this
case, gaps may only need to be identified and filled in, according to the
purposes of the program and its relationship to curricula. Other
possibilities include taking over a scope and organization from another agency,
adopting some form of skill taxonomy, analyzing preexisting curricula and
syllabi for an overall framework and determination of scope, adopting a
framework identified in national studies, and analyzing the nature and
structure of skills typically required in various life roles (NWREL, 1978).
In most cases, the purposes and already determined policies of the program
can help to determine which approach might be most appropriate.

Since the emphasis in Kanawha County's testing program, for example,
is on its interrelation with the curriculum, competencies were identified
from the instructional guides and programs already in use in the school
district. Both the Arizona and Ohio Departments of Education recommend
that competency scope and sequence be linked directly to local district
program goals (Arizona, SDE, 1979; Ohio, SDE, 1978). The Illinois Office
of Education suggests that local districts identify priority categories
for competencies as a first step to selecting or developing them (Illinois,
SOE, 1978). And the Colorado Department of Education identifies ways of
categorizing objectives in taxonomic domains, with attention to encouraging
the development of higher-order cognitive objectives (Colorado, SDE, 1975).






Assessment parameters . After the committee has come to an agreement
on a competency framework or topical outline, the members will need to
determine the parameters of assessment, since these decisions can help to
provide them with guidelines for their actual identification of
competencies. Issues for the committee to consider, review, or come to consensus
on may include:



— How many competencies are to be generated?

— How specific or general are the competencies to be?

— What are the time limitations on the test? 

— How many competencies per domain should be identified? 

— How many test items per competency will there be? 



Frequently, time allocations for testing are predetermined within a 
program, so that it is the task of program personnel to identify competen- 
cies (and later, tests) which can meet their purposes within the specified 
amount of time. And because of those limits, trade-offs between the number 
of competencies and the number of items per competency are often necessary. 
One potential problem to be aware of in making this trade-off is that 
committee members, because they are concerned about subject coverage, can 
often be resistant to limiting the skills covered by the test. 

There are two possible results of this problem, both presenting some 
difficulty to assessment and the program as a whole. First, so many 
overly specific objectives might be defined that dissemination and accep- 
tance of them would be difficult to effect, and assessment options would 
be restricted. With respect to the latter, for example, suppose under the 
domain of reading comprehension that the following objective was defined:
"The student shall be able to identify the main idea of a newspaper
article." This restricts assessment to a particular type of question
("What is the main idea of this article?") and a particular type of
item-related content (a newspaper article). Moreover, a case can be made that
students should be able to read and comprehend all aspects of movie bills,
street signs, advertisements, textbooks, magazines, and various other
notices. Either a large number of competencies must be identified to
cover all of these circumstances deemed important, or the original
competency needs to be made less specific.

The second possible result may be a tendency to make the objectives 
too general, with the aim of increasing the types of test items that can 
eventually be matched to them. The problem here is that objectives may be 
made so general that they will provide no guidelines for appropriate 
assessment, with the consequence that no reasonable number of items could
possibly assess the competency's domain adequately. 






With regard to time limits for testing, the total test must be
considered when determining both the number of competencies to identify and
the number of items to use for testing. According to the state of the art
in testing, one minute per conventional multiple-choice item is a general
rule of thumb, and a minimum of four items per objective is acceptable in
order to meet the minimum for stable reliability estimates (Rubinstein &
Nassif, 1977; Schooley et al., 1976). Therefore, within these boundaries,
a typical one-hour test of about 60 items can measure 15 competencies. If a
longer test is possible, then more competencies may also be specified; if
more competencies are necessary or desired, then the effect on test length
and time for administration must be considered. While gauging the appropriate
level of specificity in order to write or select competencies that can be
measured in four or so items is mostly intuitive, practitioners report a
surprising degree of agreement when the issues are clearly understood
(Rubinstein & Nassif, 1977).
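
The trade-off between test length and the number of competencies can be checked with simple arithmetic. The following is a minimal sketch in Python, assuming the rules of thumb cited above (one minute per multiple-choice item, four items per competency); the function names and default values are illustrative only and are not part of any program's procedures.

```python
def max_competencies(test_minutes, minutes_per_item=1.0, items_per_competency=4):
    """Number of competencies that fit in a test of the given length.

    Defaults follow the rules of thumb cited in the text; both are assumptions
    that a program may wish to change.
    """
    total_items = int(test_minutes // minutes_per_item)
    return total_items // items_per_competency


def required_minutes(n_competencies, minutes_per_item=1.0, items_per_competency=4):
    """Testing time needed to cover a given number of competencies."""
    return n_competencies * items_per_competency * minutes_per_item


print(max_competencies(60))   # one-hour test -> 15
print(required_minutes(20))   # 20 competencies -> 80.0 minutes
```

A committee that wishes to add competencies can use the second function to see the effect on administration time before committing to a longer list.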



Writing/Selecting 



The probable outcomes of the procedures outlined in the previous
section are committee agreement on a number of issues (test parameters,
competency specificity and scope, competency organization) and perhaps a
preliminary specification of a number of competencies. The major task now
is to identify those competencies which are probable candidates for
inclusion in the final set. As noted previously, committee members can
nominate competencies which they have created, or they may nominate
competencies from available sources. They may also choose to revise or adapt an
existing statement to meet a specific purpose, which is a combination of
the two processes.

Regardless of the method used, an initial set of competencies can be
generated as a first step. With the parameters and issues in mind that
were discussed in the previous section, members may nominate or write,
individually or in a group session, any and all statements that, in their
view, fit the specifications. Then, review and discussion can occur to
settle disagreements about content and to ensure that no gaps remain that
need to be filled. At this point, statements that are essentially geared
to the same competency may be combined to bring the number of objectives
to a more manageable level. Referring to the example described in the
previous section, for instance, several competencies may be combined to
read: "The student shall be able to read and comprehend materials
typically encountered in everyday life (e.g., newspapers, magazines,
advertisements, etc.)."






At this point, too, a number of specific issues related to the
structure, phraseology, and taxonomic level of the competency statements and
the implications of this for assessment come to the fore. First, one
standard structure that can be used for generating objectives is Mager's
(1962) model, in which an objective has three components: the condition,
the performance, and the standard. The condition refers to the given
situation to which the performance is related; the performance is the task
or skill to be demonstrated; and the standard is the criterion for judging
whether or not the examinee has met both the condition and the performance
(in the case of multiple-choice items, the standard is always to choose
the correct response from the alternatives provided). These components
specify the particular parameters within which the assessment can be
conducted.
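
The three-part structure described above can be illustrated with a small record type. This is only a sketch in Python: the class name, field names, and sentence template are assumptions made for illustration, and the sample wording adapts the newspaper-article objective used earlier in this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    """An objective with the three components named in the text."""
    condition: str    # the given situation
    performance: str  # the task or skill to be demonstrated
    standard: str     # the criterion for judging success

    def statement(self) -> str:
        # Hypothetical sentence template; programs word objectives differently.
        return f"{self.condition}, the student shall {self.performance} {self.standard}."


main_idea = Objective(
    condition="Given a newspaper article",
    performance="identify the main idea",
    standard="by choosing the correct response from the alternatives provided",
)
print(main_idea.statement())
```

Keeping the three components separate makes it easy to see which part of a draft statement is missing or too vague before the statement goes to review.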

The second issue relates to the nature of the verb that is chosen for
any given objective, an issue that is important to consider since the verb
will govern the meaning of the objective and the nature of the items that
can measure it. Verbs such as "describe" or "discuss," for example,
suggest measures other than multiple-choice items; it may be that other
types of measurement besides multiple-choice items are desired, but that
question is one to consider carefully. Verbs such as "demonstrate" suggest
observable performance but do not specify the nature of the performance,
and such verbs as "know" or "understand" involve unobservable behavior.
Items appropriate for either of these cases are difficult to identify. It
is generally advisable, therefore, to select verbs which represent actions
that can be tested by the types of items desired for the test.

Finally, the taxonomic level of each competency is a factor to
consider when making judgments about the appropriateness of each objective to
the grade and skill level for which it is intended. "Taxonomic level"
refers to a classification of skills (cognitive, affective, or psychomotor)
used to identify the level of, for example, cognitive thought required to
demonstrate a particular behavior. Objectives for the third grade, for
example, are more likely to deal with knowledge and comprehension, which
are relatively simple levels of cognitive process on Bloom's (1956)
taxonomy, than with synthesis or evaluation. Verb selection also relates to
taxonomic level since verbs can be chosen to reflect specific, desired
levels and will influence the type of assessment possible. Verbs such as
"define" and "identify," for example, relate to skills at the knowledge
level while those such as "apply" and "generalize" can be used in relation
to higher-order application skills.
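
The verb guidance in the preceding paragraphs can be collected into a simple lookup that a reviewer might run over draft statements. The sketch below, in Python, contains only the verbs named in the text; the table name, function name, and note wording are illustrative assumptions, and the matching is a plain word search rather than any grammatical analysis.

```python
# Notes paraphrase the text; the verb list is limited to the examples given there.
VERB_NOTES = {
    "describe": "suggests measures other than multiple-choice items",
    "discuss": "suggests measures other than multiple-choice items",
    "demonstrate": "observable, but the nature of the performance is unspecified",
    "know": "involves unobservable behavior",
    "understand": "involves unobservable behavior",
    "define": "knowledge-level skill",
    "identify": "knowledge-level skill",
    "apply": "higher-order application skill",
    "generalize": "higher-order application skill",
}


def check_verb(objective: str) -> str:
    """Return a note for the first listed verb found in an objective."""
    for word in objective.lower().replace(",", " ").split():
        if word in VERB_NOTES:
            return f"{word}: {VERB_NOTES[word]}"
    return "no listed verb found"


print(check_verb("The student shall be able to identify the main idea of a newspaper article."))
print(check_verb("The student will understand fractions"))
```

A check of this kind only flags wording for discussion; judging whether a verb suits the intended grade level and item type remains a committee decision.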

Studying examples of competencies that have been identified for 
different taxonomic levels, different grade levels, and different purposes 
is one way in which to gain familiarity with the concepts in order to 
apply them to the situation at hand. 






Reviewing/Refining/Validating the Competencies

Once an initial number of competencies has been generated, either by
the selection of existing competencies or the development of new ones or
by some combination of both methods, a process of competency review and
refinement is generally warranted. In conducting such a review, program
administrators may choose to utilize the same staff who identified the
objectives, or they may organize a separate review committee (and the same
issues of size and selection pertain here as they did earlier). Reviews
may be carried out within the committee or agency, or the competencies may
be validated through external reviews by the public, other educators, or
other professionals. Reviews may be accomplished at meetings or through
the use of more formal instruments such as survey questionnaires. How
each of these issues is resolved again depends upon the purposes of the
individual program, the degree of external or internal approval that is
either desired or required, available time, and available resources.

As one example, if input is desired about the relationship of the
competencies to skills required in a particular occupational field, then a
rating of the objectives by specialists in that field would be appropriate.
In this case, questions like the following could be asked about each
competency:

— How often is the skill used on the job?

— How does the skill relate to emerging fields within that
discipline?

— How important is the skill considered to be, whether it is
currently taught or not?



Public acceptance of and involvement in an MCT program may also be
obtained through external reviews of the competencies by citizens, 
teachers, parents, and representatives of the business community. In this 
case, the questions may be of a broader nature, particularly if the compe- 
tencies are intended to reflect life-oriented skills (e.g., "Do you think 
ninth-graders should know this?"). 

Whether or not extensive public surveys are selected as a means to
competency validation, reviews by content specialists are frequently
conducted to ensure content validity of the competencies. Additional
committees of content specialists may be formed for this purpose, or locally or
nationally known specialists may be asked to react to the materials.
Table 1 presents examples of the types of criteria that can be adapted and
utilized in programs for such reviews. These criteria represent the types
of judgments that programs typically make about skills statements, whether
the judging is accomplished formally or informally.



Selecting the Final Set of Competencies 



On the basis of the results of reviews conducted, the competencies
can be refined and finalized. At this point, another round of reviews may
be conducted, or additional input may be solicited from various groups if
the need for either is felt. If not, then the final product is complete
and ready for implementation with the competency testing program.



Summary Guidelines for Defining Competencies:
Three Examples



Whether a choice is made to follow an explicit model such as those
presented in this section or to define a unique process and set of
procedures will depend upon the particular program: its purposes, timeline,
resources, staff experience, and staff interest. The purpose of this
section is to provide additional resources upon which administrators may
choose to draw.

Presented briefly are frameworks for competency definition
established by the Illinois Office of Education, the Ohio State Department of
Education, and the Northwest Regional Educational Laboratory.



Illinois 



Presented in Figure 1 is a process for defining competencies that was 
constructed by the Illinois Office of Education as a resource for local 
districts which opt to implement minimum competency testing. In Illinois, 






TABLE 1

Criteria for Reviewing Competencies



Teachability

(1) Is it possible for the schools to teach the knowledge, skills, and/or attitudes
described in the competencies?

(2) Is curriculum available related to the individual competencies?

(3) Will remedial programs, if needed, be available now or in the near future?



Acceptability

(1) Do the competencies represent reasonable standards of proficiency to be
required of all students?

(2) Are the competencies agreed to be necessary outcomes for student success in
school or their daily lives?

(3) Are the competencies reasonable, appropriate and important outcomes of the
total educational experience?



Bias

(1) Are the competencies free of statements that suggest that some social,
occupational or life roles should be valued more than other roles?

(2) Are the competencies free of bias related to sex, race, age, region, religion,
ethnic, or cultural background?






Generalizability

(1) Are the competencies achievable regardless of students' sex, socioeconomic
status, race, rural or urban setting, and religious belief?

(2) Will all students for whom a particular competency is applicable be exposed to
sufficient instruction to achieve the specified knowledge, skill or attitude?

(3) Are the competencies appropriate for those students who transfer within the
state?



[Category heading illegible in source]

(1) Are there available and acceptable ways to measure the outcomes specified by
the competencies?

(2) Can the competencies be measured within the schools' time constraints and
resources?

(3) Will adequate educational resources (e.g., time, staff, money, books and
materials) be made available to support the implementation of the
competencies now or in the near future?

(4) Are the competencies free of specifications which would require special
equipment or facilities which are not available to most students?



Validity 

(1) Does the content of the competency fit the intent of the topic or goal 
statement for which it was written? 

(2) Can items be developed for the competency to measure the domain intended 
by the topic? 






(3) If competencies are to be used as a basis for promotion or graduation, do the
competencies represent levels of student proficiency and accomplishment of
sufficient importance?

(4) Does each competency identify a significant or important skill, in relation to
the infinite number of skills which could be chosen?



Specificity

(1) Are the competencies worded specifically enough so that it is clear what skills
are and are not included in the competencies?

(2) Is the content domain specified by the competency too broad or too narrow?

(3) Is each competency unique, mutually exclusive, so that extensive overlap does
not occur among them?



Taxonomic Level

(1) Is the taxonomic level of the competency appropriate to the subject matter of
the topic and to the grade level?



Clarity

(1) Are the competencies free from jargon, slang, colloquialisms, or other unusual
terms?

(2) Are the competencies stated clearly and succinctly?

(3) Are the competencies written so they communicate effectively to students,
parents, community members, teachers, administrators, and other interested
individuals?



FIGURE 1*
Defining a District Set of School Leaving Competencies: The Process

[Flowchart not reproduced. Legible elements: Step One, Organize (select
coordinator; establish task force); Step Two, Orient Task Force (conduct
orientation; administer instrument); Step Three, Identify Priority
Competency Categories; Step Four, Determine Needed Competencies (gain
consensus on priority subcategories; select competency statements;
determine and fill gaps); Step Five, Prepare Priority List of Competencies;
Step Six, Secure Board Approval (prepare competencies document; recommend
policies; disseminate report). Boxes labeled "Public" connect to several
of the steps, and a "Repeat Process" path allows the cycle to be repeated.]

* From Performance Indicators for Competency Assessment, Illinois State
Office of Education, Springfield, Illinois: Author, 1978.




MCT is a local district option, and the state offers technical assistance
in the form of published documents, consultants, and regional centers to
the local districts selecting the option.

As evident in the figure, Illinois has identified six basic steps
leading to the identification of a set of competencies suitable for
graduation assessment. The first involves the selection of a competency
coordinator and the members of a task force who will be directly involved in
carrying out the process. During the orientation of the task force,
competency areas are ranked in order of importance to the individual school
district. Then in Step 3, competency category priorities are established
and specific competency subcategories are rated. When priority
subcategories are established, initial competency statements are selected and
others are developed to fill whatever gaps are noted. Each statement is
then rated by asking how important it is that a student acquire this skill
before leaving high school. On the basis of the results, a priority list
of competencies is established. The last major step in this process is to
present the competencies to the district board of education and obtain
approval for implementation. It should be noted, too, that the system
allows for public involvement at a number of points as well as for a
cyclical process of refinement.
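
The rating step in this process can be illustrated with a short sketch. It
is not the Illinois instrument itself: the competency statements, the
1-to-5 importance scale, and the cutoff values below are all assumptions
made for illustration only.

```python
from statistics import mean

# Hypothetical ratings: each task force member rates how important it is
# that a student acquire the skill before leaving high school
# (assumed scale: 1 = not important, 5 = essential).
ratings = {
    "Write a coherent paragraph": [5, 4, 5, 4],
    "Balance a checkbook": [3, 4, 3, 4],
    "Use a voting machine": [2, 3, 2, 3],
}


def priority_list(ratings, cutoff=4.0):
    """Return (statement, mean rating) pairs at or above the cutoff,
    ordered from highest to lowest mean rating."""
    scored = [(statement, mean(values)) for statement, values in ratings.items()]
    kept = [pair for pair in scored if pair[1] >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Print the priority list using a lower, equally hypothetical cutoff.
for statement, score in priority_list(ratings, cutoff=3.5):
    print(f"{score:.2f}  {statement}")
```

A district would substitute its own statements, scale, and cutoff; the
point is only that the priority list follows mechanically once the ratings
are collected.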



Ohio 



The Ohio Department of Education and the NWREL both take a more
question- or issue-oriented approach to establishing a competency program
and defining the requisite competencies. The Ohio Department of Education
recommends that a task force or advisory committee consider the following
questions carefully in their competency identification process:



— What is the purpose of the competency program? 

— What competency areas will be addressed? 

— What grade levels should be used for measurement? 

— Shall there be individualized or uniform competencies? 



The purpose of considering these issues is, in the view of the Department, 
to help provide a framework for administrative decision making. 






NWREL 

The series of questions that the NWREL considers in its Guide to
Identifying Graduation Competencies (1978) are:



(1) What is a graduation competency?

(2) What kinds of knowledge, skills, and attitudes should be included
in graduation competencies?

(3) How can one determine that the coverage of a set of graduation
competencies is accurate?

(4) How general or specific should the content of a competency be?

(5) What degree of difficulty should graduation competencies
represent?

(6) Should the same set of graduation competencies be adopted for
all students?

(7) Who should be involved in drafting and adopting a set of
graduation competencies?

(8) What format should be used for stating graduation competencies?



With this set of questions as a basic framework, the NWREL discusses the
nature of competencies in a competency-based educational program, their
potential role in the educational system, and what their adoption as a
graduation requirement can mean in terms of measurement, instruction, and
instructional management.






References 



Airasian, P., Pedulla, J., & Madaus, G. Policy issues in minimal
competency testing and a comparison of implementation models. Boston:
Heuristics, Inc., 1978.

Arizona, State Department of Education. Suggested guidelines for the
development and implementation of a continuous uniform evaluation
system. Phoenix, Arizona: Author, 1979.

American Association of School Administrators. The competency movement:
Problems and solutions. Arlington, Virginia: Author, 1978.



Bloom, B. S. Taxonomy of educational objectives: Handbook I, cognitive
domain. New York: Longmans, 1956.

Bossone, R. M. Minimum competencies: A national survey. New York: City
University of New York, Center for Advanced Study in Education, 1978.

Bunda, M. A., & Sanders, J. (Eds.). Practices and problems in
competency-based measurement. NCME, 1979.

California, State Department of Education. Technical assistance guide.
Sacramento, California: Author, 1977.



Candor-Chandler, C. Competency measurement at the local level: A case
study of Kanawha County Schools, West Virginia. In R. B. Ingle,
M. R. Carroll, & W. J. Gephart (Eds.), The assessment of student
competence in the public schools. Bloomington, Indiana: Phi Delta
Kappa, 1978.

Colorado, State Department of Education. A school improvement
accountability process. PAK #3.1: Writing student objectives. Denver,
Colorado: Author, 1975.






Illinois, State Office of Education. Performance indicators for competency
assessment. Springfield, Illinois: Author, 197[?].

Mager, R. F. Preparing instructional objectives. California: Fearon
Publishers.

National School Public Relations Association. The competency challenge.
Arlington, Virginia: Author, 1978.

Nebraska, State Department of Education. Nebraska assessment battery of
essential learning skills administrative manual (rev. ed.). Lincoln,
Nebraska: Author, 1977.

Northwest Regional Educational Laboratory. A guide to identifying high
school graduation competencies. Portland, Oregon: Author, 1978.

Ohio, State Department of Education. Competency handbook. Columbus,
Ohio: Author, 1978.

Schooley, D. E., Schultz, D. W., Donovan, D. L., & Lehman, I. J. Quality
control for evaluation systems based on objective-referenced tests.
Paper presented at the annual meeting of the American Educational
Research Association, San Francisco, 1976.






CHAPTER 3 
TEST SELECTION AND DEVELOPMENT 
Michael Priestley 



Introduction



Purpose 

The entire process of selecting or developing a test is analogous to
distilling salt water: you begin with a barrel of brine, turn it into
steam, filter out the impurities, condense it, and out the other end comes
a quart of water pure enough to drink. In selecting or developing a test,
a similar process takes place: the process, from beginning to end, is one
of filtering, refining, and gradually defining the material in specific
terms so as to produce a test that is good enough to use.

The purpose of this chapter is to discuss issues and procedures
related to test selection and development that may be useful to
practitioners who are responsible for planning, implementing, or reviewing the
test selection and development components of a minimum competency testing
program. The site visits conducted in this study of MCT programs, as well
as an analysis of materials disseminated by the programs visited, revealed
a set of basic issues of concern to practitioners in the field, and it is
this set of issues that forms the basis for this chapter. In addition,
the program personnel who were interviewed identified various procedures
for selecting and/or developing a test that they had found useful.

Although specific issues and procedures are presented here, these are
neither exhaustive nor prescriptive. Rather, they represent lists of
concerns and practices cited by program personnel. While, as in distilling
water, a certain sequence of events is a prerequisite for manufacturing
pure water, there is still some latitude with respect to both the order of
events and, within each task, the procedures used to complete it.
Similarly, although the activities discussed below are presented in a certain
order, a sequence which is based upon that followed by many programs, it
is not the case that the tasks associated with test selection and
development must be sequenced in this order. Both the order of the tasks and the
means used to carry them out are choices to be made on an individual basis,
taking into account the specific circumstances of each program.






Content 



The first section of this chapter covers the issues related to
determining whether to (1) select a test; (2) develop a test; or (3) combine
the two approaches in some fashion, for example, by selecting half of the
test items and writing the other half specifically to match the identified
competencies. These issues represent concerns voiced by personnel in the
field. The following sections of the chapter describe procedures for each
of the options that either have been or are being used by existing
programs. The final section discusses issues relating to test validity and
reliability as identified by practitioners and measurement experts. When
appropriate, examples involving existing programs will be cited and
relevant materials disseminated by programs referenced.



Context 



In its basic philosophy, this chapter is intended to present a
nonjudgmental view of the process of selecting and/or developing a
measurement instrument to be used in any type of minimum competency testing
program. It is important to stress at the outset that this chapter does
not advocate any one method or approach over another, but only presents
information which will assist educators in making informed decisions. As
much as possible, however, the advantages and disadvantages of each option
will be noted in an effort to provide a complete and sound basis for a
decision. Similarly, when testing programs are cited in the course of
discussion, the only purpose of these citations is to illustrate an issue
or procedure, or to substantiate a general statement, and not to praise or
criticize any particular program.

The considerations in selecting and developing tests are so numerous
that a treatment of this size and scope cannot possibly cover all of them.
This chapter will, however, attempt to present in detail the issues
considered most important by those who have faced them in the field, and to
provide lists of resources which may be of use to anyone who wishes to
investigate further aspects of test selection and development which may
only be mentioned briefly here.

With regard to the specific context of this chapter, two basic points
must be made. First, the reader will note that the chapter deals only
with the selection and/or development of criterion-referenced tests. This
focus follows from the finding that all programs surveyed use
criterion-referenced instruments. In one program, a norm-referenced test was altered
for use as a criterion-referenced test, a procedure that will be described
later in the chapter.

Second, an important point must be made in relation to the three
standard domains into which competencies are most often divided: the
cognitive, affective, and psychomotor domains. This chapter will focus
primarily on the cognitive domain, which includes the skills normally
required for such subjects as reading, writing, mathematics, science, and
history. The affective domain, which includes competencies related to
attitude, personality, and emotional behavior, will not be treated in
depth. There has been some debate on the issue of whether a minimum
competency program should properly test in the domain of the affective
competencies, and testing in this domain was far from widespread in the programs
surveyed for the study. For opposing views, see Wilson (1978) and Ahmann
(1978).

In this chapter, as stated, the emphasis is on the cognitive domain;
the affective and psychomotor skills will be discussed briefly in touching
upon other competencies which might be tested in addition to those
competencies of which students must actually demonstrate mastery. Testing in
the affective and psychomotor domains, however, may be useful for
diagnosis and remediation, and/or in relation to data analysis of test scores
for the cognitive domain.



There are certain specific issues and procedures which this chapter
cannot explore in depth, because of its size and scope and because of its
stated intention (to provide a basis for making decisions in any minimum
competency testing program). One of these issues is the development of
tests for special populations such as the physically or emotionally
handicapped, or people whose first language is not English. Since it is not
possible here to state general guidelines which would be applicable and
useful in these situations, it is perhaps best to leave this complex
matter in the hands of local administrators, who are the persons most
familiar with the needs and requirements of their own special populations.






Initial Decision: To Select or Develop



Preliminary Considerations

Both the experience of practitioners and materials published on this
topic suggest that program planners consider various questions prior to
deciding whether to select or develop a test. Brickell (1978) reviews some
basic questions concerning competency measurement, while the California
Department of Education in its technical assistance manual (1978) addresses
the decisions of whether to select or develop a test. Resolving the
following points may facilitate both the decision-making process and the
implementation of the decision:

• IDENTIFYING THE PURPOSE OF THE TEST. A minimum competency test
may be designed for any number of purposes, including diagnosis,
screening, evaluation, certification, and
application/advancement/selection. With the purpose of the test clearly defined,
subsequent decisions related to selecting and developing a test
can promote the effort to match the test with its intended
purpose.

• IDENTIFYING WHO IS TO BE TESTED AND WHEN. This step can help a
testing program administrator to plan the program in response to
the stated purpose of the test. For example, if a minimum
competency test is to certify students' mastery of competencies as
a requirement for graduation, should students be tested at the
end of the twelfth grade? the eleventh grade? the beginning of
the ninth grade? These decisions will significantly affect the
test itself, the mode of administration, and many other issues.

• IDENTIFYING THE DOMAIN OF THE TEST. In the previous chapter it
was pointed out that the choice of domain can have far-reaching
effects on the nature of the competency test. A domain, as used
here, is the universe of content knowledge and skills which are
defined by the competencies, and which the test will measure.
Competencies, as used in this guide, are statements of behaviors
or skills which the examinee must demonstrate by his or her test
performance, e.g., "Identify the definition of a vocabulary word
in the context of a sentence." (Such skill statements are
variously called objectives (performance, behavioral,
assessment, or terminal objectives), performance indicators, or
performance expectations.) The domain identified reflects the purpose
of the test. For example, if the purpose is to certify mastery
of basic skills (reading, writing, and mathematics) among
third-graders, then the domain is defined by competencies that
third-graders can reasonably be expected to have achieved in these
subject areas.



Which competencies are identified and how they are stated will
determine, in part, what kind of test will be required. If the competencies
state that a student must demonstrate the ability to write a coherent
paragraph, then that much of the test has been determined: all students
tested will be required to write actual paragraphs. Similarly, if the
purpose of a test is to certify auto mechanics, then one of the
competencies may state that a candidate must be able to change a flat tire. This
will then determine that part of the test must require candidates to
perform actual functions required of auto mechanics.



Issues to Consider



There are many issues to consider in deciding whether to select or
develop a test. The Ohio Department of Education (1978), for example,
raises the issues of timeline and the availability of commercial
instruments that measure the competencies of interest. Five issues in
particular were of concern to program planners who were responsible for deciding
whether to select or develop their competency tests. Each one is
discussed below, on the assumption that the primary goal is to develop some
sort of preliminary test specification, a blueprint or descriptive plan of
the test, from which the initial decision can be made.

Consequences to examinees. Major decisions made on the basis of
results from a minimum competency test may determine whether or not a
student needs remediation, whether or not the student should be promoted
or graduated, and whether or not a candidate will be certified or licensed
as a professional (e.g., a firefighter, a teacher, a veterinarian). The
more serious the consequences to the examinee, then, the more important is
the reliability of the information upon which the decision is based. If
students are to be denied diplomas, for instance, on the basis of a test,
that test should certainly be a valid and reliable instrument. (A later
section in this chapter treats the technical issues of validity and
reliability.)

Moreover, the more serious the consequences to examinees, the more
likely is the possibility of a legal challenge. A candidate who has been
denied a license to practice architecture because of failure on a
certification test may have a legal right to challenge the validity and the
reliability of the test.

The quality of the instrument is one basis for legal challenge.
Merle McClung (1977), of the Education Commission of the States, formerly
of the Center for Law and Education in Cambridge, Massachusetts,
identifies several bases for legal challenge, which are summarized below.



• VALIDITY. A minimum competency test used to make decisions
regarding the remediation, promotion, or evaluation of students
must have "curricular or instructional validity," i.e., it must
test what students have actually been taught.

• BIAS. A test used to make decisions regarding students, job
candidates, etc., must not have an adverse impact on any
minority group (EEOC, 1977); this includes the perpetuation of the "prior
effect" of racial discrimination, e.g., tracking. Even a test
that is proven to have curricular validity may cause adverse
impact; if the curriculum is biased and the test measures the
curriculum, then the test may have adverse impact.

• PHASE-IN PERIOD. McClung states that a test designed to measure
12 years of cumulative knowledge which is implemented in a
phase-in period of two years is unfair to students who then have
only two years to prepare for the test. The decision in a
recent case in Florida (Debra P. v. Turlington, 1978) conforms
with this view.



Domain and competencies. The second issue is a pragmatic one: the
domains and competencies identified for testing will influence the
feasibility of selecting a prepared test. If the competencies require a
paper-and-pencil test to measure reading and mathematics skills, then there are
many tests available which may be appropriate. The further the domain
diverges from the basic skills, the more difficult it may be to find a
test which measures that domain and its identified competencies. Few
tests, for example, measure a student's ability to paint a sign, or use a
voting machine, or analyze the logic in a political debate, all of which
could be considered minimum competencies.

Most practitioners will agree that the ideal test provides a direct
measure of competencies, but not all programs can accommodate the costs,
the time required, and the other problems connected with obtaining the
best of all possible tests. Possible compromises include using indirect
measures of competencies, e.g., a multiple-choice test of writing skills
instead of an actual writing sample. A competency that requires a student
to write a grammatically correct sentence may be measured by having the
student pick from four sentences the one that is correct, or, more
indirectly, the one sentence out of four that is incorrect.

Again, validity is an important requirement: the test selected or
developed must measure the specified competencies, which are most often
based on the curriculum. The question of direct versus indirect measures
of those competencies may be relegated to secondary importance if the
selected test measures the competencies in some way. Validity between a
selected test and the competencies varies with the subject area: tests of
basic skills such as reading and mathematics are most often based on
standard curriculum and the most basic competencies. For other subjects such
as health and physical education, nutrition, speech, social studies, and
economics, it is likely to be more difficult to find a published test
which measures the competencies identified in one of these areas because
of the lack of agreement as to what skills constitute the basic
competencies in these subjects. A developed test, on the other hand, can be
constructed specifically to measure the competencies identified in almost any
area.



Timeline. Developing a test usually requires more time than
selecting one. The National Assessment of Educational Progress (NAEP), for
example, spends about two years in the development of one of its tests
(AASA, 1978). The time devoted to selection or development will vary
depending upon a number of factors. Administrators and teachers in South
Burlington, Vermont prepared tests to measure all of the state
competencies during the summer of 1977 and revised the instruments the following
year. This was made possible by the state-mandated schedule of
implementation.

In some cases, however, legislative mandate has required the
implementation of testing programs within only a few months. In New Jersey,
for example, the mandate called for immediate implementation of minimum
competency testing. Although Minimum Basic Skills tests were to be
developed, given the schedule of implementation, the tests developed for the
statewide assessment program were used during the first year. These
instruments were replaced the following year by the newly developed
Minimum Basic Skills tests.

A shorter time is usually required to select a test than to develop
one because the process is much simpler: a selected test requires no item
writing, for instance, and often requires only limited field-testing, if
any, before actual administration.

Resources. The cost of developing a test is generally greater than
the cost of selecting a test, since test development involves not only
out-of-pocket expenses but also the cost of staff time. The amount of
staff time allocated varies according to who actually does the
developmental work, such as writing items. Staff may be employed only as project
monitors to coordinate the volunteer efforts of teachers, or they may be
responsible for developing the entire test in-house. In the latter case,
if permanent full-time staff are used on a particular project, additional
personnel may be needed to assume the responsibilities of the regular
staff for the duration of the project.

The cost of administering and scoring tests may not vary
significantly between the selected test and the developed test, but
administration costs vary with the program design. If a test can be administered by
local teachers, for example, this is less expensive than establishing a
special team of people to administer all tests. The latter procedure is
used in many programs to improve test security and to standardize
conditions for test administration, as in New Jersey, where the administration
of basic skills tests is supervised by county test coordinators.

In programs in which part or all of the project work is awarded to a
consultant, the cost of the consultant must be considered. Costs
associated with consultant contracts are generally higher for test development
than for test selection because of the likelihood that more expertise will
be required in developing a test than for advising on the selection and
use of a published test.

Availability of technical expertise. Expertise is usually required
both for selecting and developing tests, although to different degrees.
Types of expertise required most often involve psychometric issues such as
test reliability, validity, scoring, and data analysis; but specialized
knowledge may also be needed in such areas as curriculum, subject matter,
and readability.






To select a test, technical expertise will be required in assessing
the quality of test instruments, i.e., their validity, reliability, and
lack of bias. This expertise will be required to develop criteria for
screening tests and to develop procedures for applying these criteria
systematically to the potential tests.
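
One way to picture "applying these criteria systematically" is the
following minimal sketch. The three criteria, the threshold values, and
the candidate tests are all hypothetical; an actual program would set
its own criteria with the help of its measurement specialists.

```python
from dataclasses import dataclass


@dataclass
class CandidateTest:
    name: str
    competencies_covered: int  # identified competencies the test measures
    reliability: float         # reported reliability coefficient, 0 to 1
    bias_review_done: bool     # whether a bias review is documented


def screen(tests, total_competencies, min_coverage=0.8, min_reliability=0.85):
    """Return the names of tests that meet every screening criterion.

    The default thresholds are illustrative assumptions, not recommendations.
    """
    passed = []
    for t in tests:
        coverage = t.competencies_covered / total_competencies
        if (coverage >= min_coverage
                and t.reliability >= min_reliability
                and t.bias_review_done):
            passed.append(t.name)
    return passed


# Hypothetical candidates screened against 20 identified competencies.
candidates = [
    CandidateTest("Test A", 18, 0.90, True),
    CandidateTest("Test B", 12, 0.92, True),   # covers too few competencies
    CandidateTest("Test C", 19, 0.80, True),   # reliability below threshold
    CandidateTest("Test D", 20, 0.88, False),  # no documented bias review
]
print(screen(candidates, total_competencies=20))
```

Applying the same checks to every candidate in the same way is what makes
the screening systematic; the checks themselves remain a local judgment.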



To develop a test, technical expertise will be required in order to
write or select, review, and edit test items, as well as to sequence the
items and to actually construct the test.

Whether the decision is to select or develop a test, expert knowledge
will be advantageous in planning and implementing the overall design of
the testing program and of the test, since someone familiar with these
processes will be aware of the consequences and ramifications of each step
in the process. For example, a knowledge of how the test will be scored,
e.g., by specific subtests or by competencies, is essential in determining
the number of items that must be developed for the test. Whether the
scoring provided by the test publisher (if it is provided) will suit the
needs of the program, i.e., tell you what you need to know, may also
influence the decision to develop or select.

All of these issues are ones to consider in deciding whether to select
or develop a test. Once these issues have been resolved, a picture of what
the ultimate test may look like will begin to form.



Preliminary Test Specifications

At this stage, program planners may want to consider preparing test
specifications, a kind of blueprint which serves as an ideal description
of the final test. Test specifications are useful at this point because
they can bring out issues and concerns that may not have been considered.
For example, if a test is selected or developed this year and administered
to ninth-graders, should the same test be used next year? Or must parallel
or equivalent forms of the test be developed now for later use? Obviously,
the answers to these questions will affect every aspect of the test design,
down to the choice of distractors in each test item.

In the programs surveyed, the test specifications were generally
developed as part of a total program design to ensure that the test which
was selected or developed would meet the needs and purpose of the testing
program. Careful consideration can then be given to test length, the
number of competencies measured, and the numbers of items required to
yield the types of scores desired (e.g., by subtest or by competency).
These decisions must be made in relation to all of the factors discussed
previously, but particularly in relation to the purpose of the program
(what kinds of test scores are needed?); cost (how much more will a
100-item test cost than a 50-item test? how much more will it cost to
score open-ended items or writing samples than multiple-choice questions?);
and time, in relation to both the time required to implement the program
and the actual time for administering the test (how much of the student's
and teacher's time should be spent on this test?).


This rough form of the test specifications, then, usually includes:
(1) an estimate of how long the test should be; (2) how many competencies
will be measured and by how many items; (3) what kinds of items will be
used; and (4) what kinds of scores the test will generate. After this the
decision of whether to select or develop the test may become clearer, as
the outline of the test emerges from these specifications. This
preliminary test design may have to be modified later, particularly if the
decision is in favor of test development; but this step offers a glimpse of
the chimera being stalked in the search for the right test.
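
The four elements of a rough specification can be written down as a small
record, which also makes the 50-item versus 100-item comparison concrete.
The sketch below is only an illustration: the field names, the one minute
allowed per item, and the scoring cost of two cents per item are
assumptions, not figures from any program surveyed.

```python
from dataclasses import dataclass


@dataclass
class PreliminarySpec:
    competencies: int                # (2) how many competencies are measured
    items_per_competency: int        # (2) and by how many items each
    item_type: str                   # (3) kind of item, e.g. "multiple-choice"
    score_by: str                    # (4) "subtest" or "competency"
    minutes_per_item: float          # assumed average working time per item
    scoring_cents_per_item: int      # assumed cost to score one response

    def total_items(self):
        """(1) Test length, expressed as a number of items."""
        return self.competencies * self.items_per_competency

    def testing_minutes(self):
        """Estimated administration time for one examinee."""
        return self.total_items() * self.minutes_per_item

    def scoring_cost_cents(self, examinees):
        """Estimated scoring cost, in cents, for a group of examinees."""
        return self.total_items() * self.scoring_cents_per_item * examinees


# Two hypothetical designs measuring the same ten competencies.
short_form = PreliminarySpec(10, 5, "multiple-choice", "competency", 1.0, 2)
long_form = PreliminarySpec(10, 10, "multiple-choice", "competency", 1.0, 2)

extra = long_form.scoring_cost_cents(1000) - short_form.scoring_cost_cents(1000)
print(short_form.total_items(), long_form.total_items(), extra)
```

Under these assumed figures, doubling the items per competency doubles
both testing time and scoring cost; real costs would also depend on the
item type and the mode of administration.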



Types of Tests

The final issue to consider independently before making the decision
is what type of test to use, i.e., given specific competencies, how will
these be measured? Program planners can turn to various programs and
materials for examples of different kinds of assessments. The California
Department of Education discusses various modes of assessment, including
performance-based testing, in its resource materials prepared for local
districts (1978).

Three general approaches to testing are available: performance
tests, observational tests, and paper-and-pencil tests. A performance
test can be defined as one "that measures performance on tasks requiring
the application of learning in an actual or simulated setting" (CAPT
Newsletter, 1978). This type of test is the most direct method of
assessment in a number of situations: when testing competencies that require
physical manipulation, such as using a telephone in an emergency situation
(e.g., ConVal, NH), building with blocks, performing calisthenics, or
adjusting a carburetor; or when testing competencies that require
"on-the-job" situations involving social interaction, such as sales techniques,
giving and following oral directions, or making introductions (e.g., South
Burlington, VT).






Observational tests measure competencies which involve behaviors or
skills that can only be assessed through observation, e.g., teaching
skills, or behaviors in a social setting. This approach to testing is not
entirely divorced from performance testing (obviously, if an examinee is
performing, someone else must be observing in order to evaluate the
performance), but observational testing also has unique characteristics in
that it can be conducted in real or simulated situations. Observational
tests are especially useful in situations which are not contrived
intentionally for the purpose of testing, e.g., observing preschool children
for such competencies as attentiveness, observing rules of conduct, and
initiating conversation with peers.

Paper-and-pencil tests are by far the most common tests in use;
these include multiple-choice tests, tests made up of open-ended items
(such as fill-in-the-blank items), and essay-type tests (which include
writing samples, design problems, and so on). Brickell (1978) contends
that paper-and-pencil tests can be considered performance tests in a
school situation because taking a paper-and-pencil test is an actual
performance in school. These tests essentially measure the application of
knowledge and skills.

Different kinds of tests generate different kinds of results, so the
choice of what type of test to select or develop is a very important one.
To some extent, the type of test required is determined by the
competencies to be measured. For example, the most direct assessment of a
competency that is stated as "Describe the four basic food groups" would be an
oral or written description. Each type of test is ideally suited to
particular competencies. One type of test may be chosen to measure all
competencies, with some measured more directly than others. Or some
combination of the types of tests may be chosen: a performance test for life
skills, such as comparison shopping or using the library, and a
paper-and-pencil test for the school skills. The state of Hawaii, for example, has
developed a battery of tests for third-graders to measure 100 different
competencies (termed performance expectations). The test battery, used
for screening and then for diagnosis, includes hundreds of test items of
different types: performance items for physical exercises, oral-response
items, verification checklists and rating scales, and several types of
paper-and-pencil items. Hawaii's educators have chosen to use the most
direct method of assessment in every case possible, on a
competency-by-competency basis.

In general, cost and objectivity favor the use of paper-and-pencil
tests, but relevance and face validity may favor the performance test
(Brickell, 1978; Mehrens, 1978). The further one goes from performance
tests and the closer to paper-and-pencil tests, in general, the less
expensive testing becomes. Also, the closer to paper-and-pencil tests,
the larger is the number of tests one will have to choose from.






Making the Decision

This is the point at which all of the factors, issues, and
considerations discussed can be weighed, one against the other. The
relative importance of cost versus direct assessment, of timeline versus
legislative mandate, of test validity versus available resources, and so
on, can be determined with respect to the specific program under
development. The results of all the preliminary analyses of these issues
and considerations can be helpful in making the decision.

The following two sections of this chapter describe actual procedures
generally followed in selecting a test, developing a test, or combining
the two. The first of the two sections deals with test selection, the
second with test development. Following these sections, the discussion
returns to issues which affect both developed and selected tests, such as
field-testing, technical and legal issues of validity and reliability,
and test administration.



Test Selection 



Programs choosing to select a test (e.g., North Carolina) typically
carry out a number of procedures prior to that selection. These include
considering the test domain and sources of possible tests, developing
criteria for selection, identifying potential instruments, and applying
the selection criteria in order to arrive at a decision. In considering
selection criteria, program planners may elect to use criteria that have
already been developed and used, such as the MEAN System developed by the
Center for the Study of Evaluation (CSE). Developing or choosing criteria
for selection as well as the other procedures will be discussed later in
this section.




Considering Test Domain and Sources of Tests

Domain. The key element here is congruence, i.e., the relationship
between what the test purports to measure and the competencies that have
been identified for testing. Madaus et al. (1979) state that congruence
is a function of two considerations: (1) the number of competencies
measured by the test, and (2) the number of items measuring each
competency. An initial review to identify tests which measure both the
broad competency areas (e.g., reading, mathematics) and the specific
competencies, with an appropriate number of items measuring each
competency, may narrow down the number of tests to consider as potential
candidates.

Program planners may encounter difficulty in finding a test which
measures exactly those competencies identified for testing in a particular
program; a reasonable approach, therefore, is to seek the test(s)
measuring the largest percentage of those competencies.
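
The two congruence considerations above (how many of the program's
competencies a test measures, and how many items it devotes to each) can
be tallied for each candidate test. The following is a minimal sketch
only, not a procedure taken from Madaus et al.; the competency labels,
test names, item counts, and the min_items threshold are all hypothetical.

```python
def coverage(program_competencies, test_item_map, min_items=1):
    """Return (share of program competencies covered, list of covered ones).

    test_item_map maps a competency label to the number of items on the
    candidate test that measure it. A competency counts as covered only
    if the test has at least `min_items` items for it (an assumed rule).
    """
    covered = [c for c in program_competencies
               if test_item_map.get(c, 0) >= min_items]
    return len(covered) / len(program_competencies), covered


# Hypothetical program competencies and candidate tests.
program = ["main idea", "following directions", "computation", "measurement"]
candidates = {
    "Test A": {"main idea": 6, "computation": 12, "measurement": 2},
    "Test B": {"main idea": 5, "following directions": 5, "computation": 5},
}

results = {name: coverage(program, items, min_items=5)
           for name, items in candidates.items()}
# Pick the candidate measuring the largest percentage of competencies.
best = max(results, key=lambda name: results[name][0])
print(best, results[best])
```

A tally of this kind only narrows the field; the reviewers still have to
judge whether each item actually measures the competency it is matched to.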

With regard to the number of items required per competency, Berk
(1979) states that the number varies in relation to four essential
factors: (1) importance and type of decisions to be made on the basis of
results; (2) relative importance assigned to the competencies; (3) the
number of competencies; and (4) practical constraints. Berk recommends
that 5-10 items per competency be used for most classroom decisions and
10-20 items be used for school, system, and state-level decisions. More
items per competency will be required for scoring by competency (i.e.,
determining pass/fail or mastery for each skill) than for scoring by
subtest or total test score. Fewer items may suffice in certain
situations, as in a test for which there is to be only one total score, or
one score for each of two or three subareas; then the total number of
items on the test or in each subarea matters more than the number of test
items per competency. The number of items must be considered carefully in
relation to the factors listed above to ensure selection of a valid and
reliable test.
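
Berk's recommended ranges lend themselves to a quick screening of a
candidate test. The sketch below is illustrative only: the function name,
the competency labels, and the item counts are hypothetical, and the
ranges are simply the 5-10 and 10-20 figures quoted above.

```python
# Minimum and maximum items per competency, by level of decision,
# as quoted from Berk (1979) in the text above.
BERK_RANGES = {
    "classroom": (5, 10),
    "school": (10, 20),
    "system": (10, 20),
    "state": (10, 20),
}


def flag_short_competencies(items_per_competency, decision_level):
    """List competencies measured by fewer items than the recommended minimum."""
    low, _high = BERK_RANGES[decision_level]
    return sorted(c for c, n in items_per_competency.items() if n < low)


# Hypothetical item counts for one candidate test.
counts = {"addition": 12, "fractions": 6, "estimation": 3}
classroom_flags = flag_short_competencies(counts, "classroom")
state_flags = flag_short_competencies(counts, "state")
print(classroom_flags, state_flags)
```

The same test can therefore be adequate for classroom decisions and too
thin for state-level decisions, which is the point of tying the item count
to the type of decision.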

Sources of tests. Sources from which instruments may be selected
include normative-referenced and criterion-referenced tests, with
corresponding competencies specified, and item pools or banks, large sets
of items from which appropriate measures may be selected. (For a list of
test sources, see Appendix A.)

As soon as potential "candidates" for use in the testing program have
been identified, planners may write to the publishers of the tests they
wish to acquire and request copies of the test itself, answer keys,
technical manuals, and any other information that may be helpful. In
addition, planners may wish to follow the example of the Massachusetts
Committee on Basic Skills Improvement Policy (Madaus et al., 1979), and
ask the test publishers to identify those items which, in their opinion,
measure the competencies already identified for the program.



Developing Criteria for Selection 



This task usually involves working with staff and other persons
(e.g., teachers, parents, students, business representatives, members of
the community, legislators) to develop a comprehensive list of criteria by
which to judge the available tests. The particular selection criteria and
the method of review that is chosen or developed will vary according to
program needs.

A number of methods for reviewing tests for local, state, and national
programs are discussed in the program and research literature on this
subject (see Appendix A). In Massachusetts, for example, the Department of
Education contracted with the Public Affairs Research Institute to develop
both criteria for screening commercial tests and a system for applying
these criteria (Madaus et al., 1979). The Institute identified criteria
relating both to the content of the test and to its technical properties.
The MEAN System developed by the Center for the Study of Evaluation (CSE)
is another example of possible selection criteria. The acronym stands for
the four characteristics a test is rated on: measurement validity,
examinee appropriateness, administrative usability, and normed technical
excellence. For a discussion of how the CSE staff applied this system, see
CSE (1976).

Criteria to consider include not only technical (e.g., validity, bias)
and content-related (e.g., accuracy, difficulty) issues, but also practical
features such as cost, availability of tests, and the administration of
test instruments. To facilitate this step, a comprehensive set of review
criteria can be developed to match the needs of the specific program.



Identifying Potential Instruments 



This simply consists of weeding out those tests which are obviously
irrelevant or have obvious flaws. For example, tests of personal attitudes
or civic responsibilities are irrelevant if the identified competencies
cover only reading and mathematics. Also, a test of mathematics which
requires only logical thinking and no computation may be considered
inadequate if computation is a specified competency.



Applying Selection Criteria

Once the potential instruments have been collected, the people
appointed to review the tests can do so by applying whatever selection
criteria have been developed or chosen. There are a number of ways in
which this step may be accomplished. Most of these procedures involve
the use of rating scales or checklists to quantify data from the review
process. The planner may wish to consult the test evaluations conducted
by CSE using the MEAN system. The evaluations of commercial tests for all
levels are listed in Appendix A.

It is important to note that in situations which involve lay people
as reviewers (e.g., parents or legislators who may not be at all familiar
with testing programs), program personnel have found it generally
advisable to train these people before the actual review process begins,
in order to promote the consistency across reviewers which is essential
to the review process.

If there is more than one committee, then different selection
criteria may be developed for each committee. Or different people may
rate the tests on the basis of one portion of a complete set of criteria:
for example, in Massachusetts (Madaus et al., 1979) the committee was
composed of technical experts who reviewed the tests for technical
criteria, and other members, primarily teachers, who reviewed the tests on
the basis of content criteria. Although practical issues were of secondary
importance in Massachusetts because the tests selected were not mandated,
but only approved for use, a different set of people might be appointed to
review the tests only in terms of practical concerns. Ultimately, the
results from all of these separate reviews will be aggregated.

Given a committee of people appointed to review the potential tests
(preferably the same people who developed the selection criteria, who will
thus be familiar with all aspects of the program), there are at least
three approaches to completing the review. One approach is to have
everyone rate each test independently and compile the results through a
totally objective method, e.g., keypunching and computer analysis. A
second approach, which may be favored by people who wish to feel
personally involved in a group process, is to have all the reviewers
evaluate each test simultaneously as a group, keeping one record of the
consensus on each test. A useful compromise between these approaches is to
have the reviewers rate the tests independently, and then meet as a group
to discuss their ratings and reach a committee consensus.

The advance preparation of rating scales or checklists which are easy
to read, understand, and use is well worth the time and trouble required.
The Competency Handbook (Ohio SDE, 1978) provides a number of models which
include a checklist of purposes for measuring competencies, a list of
criteria for test selection, a test nomination form, a test selection
information form, a test comparison grid, and a rating scale for
determining the relative importance of all the selection criteria.

A sample section of a rating scale from the Ohio Competency Handbook 
(1978) is included below. 






INSTRUMENT SELECTION CRITERIA

Directions:  Each committee member should sort the criteria into one
             of three categories. "H" is highest priority or most
             important, "M" is medium priority, and "L" is lowest
             priority. The entire committee should then adopt a
             consensus list of criteria. IT IS MOST IMPORTANT TO
             CONSIDER EACH OF THESE ITEMS IN TERMS OF THE PURPOSES
             SELECTED EARLIER.

Circle One:

H  M  L      1. Cost per student including materials and desired
                scoring services.

H  M  L      2. Total amount of time necessary for test
                administration.

H  M  L      3. Ease of administration (e.g., can be given by
                teachers).

H  M  L      4. Recent appropriate norms (i.e., for different
                times of year and for groups of students similar
                to yours).

H  M  L      5. High reliability and validity for the purposes in
                local testing program.




Careful planning and preparation can keep the difficulties in
reaching consensus to a minimum. The more explicit the selection criteria
and the more practical and efficient the review procedures, the easier it
will probably be to reach agreement as to which tests are most appropriate
for use in the program.

After each reviewer has rated each potential test on the basis of the 
set criteria, and/or the group has reached a consensus on each test, the 
results can be compiled and analyzed. 
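
One way to compile independent ratings is to average the reviewers'
points on each criterion and weight each criterion by its agreed
priority. The sketch below assumes a scheme the source does not specify:
the 3/2/1 weights for "H", "M", and "L", the 0-4 point scale, and the test
and criterion names are all hypothetical.

```python
from statistics import mean

# Assumed numeric weights for the H / M / L priority categories.
PRIORITY_WEIGHT = {"H": 3, "M": 2, "L": 1}


def summative_scores(ratings, priorities):
    """Combine reviewers' ratings into one weighted score per test.

    ratings[test][criterion] is a list of points, one per reviewer.
    priorities[criterion] is the committee's consensus priority.
    """
    scores = {}
    for test, by_criterion in ratings.items():
        total = 0.0
        for criterion, points in by_criterion.items():
            total += PRIORITY_WEIGHT[priorities[criterion]] * mean(points)
        scores[test] = total
    return scores


# Hypothetical consensus priorities and ratings from three reviewers.
priorities = {"cost": "M", "validity": "H"}
ratings = {
    "Test A": {"cost": [4, 4, 4], "validity": [2, 2, 2]},
    "Test B": {"cost": [2, 2, 2], "validity": [4, 4, 4]},
}
scores = summative_scores(ratings, priorities)
print(scores)
```

Averaging hides disagreement among reviewers, so a committee using the
compromise approach described earlier would still want to look at the
individual ratings before settling on a consensus.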






Selecting the Instrument 



For this final step, it may be helpful to rank-order the tests which
have received the highest ratings. This will be easy to do if the rating
process yields a summative score for each test: for example, the MEAN
system, which yields a four-letter rating of each test ("Good" ratings on
each of the four criteria used in this system would be recorded as "GGGG").
Other procedures may yield numerical scores if a certain number of points
are awarded for the adequacy of the test in relation to specific criteria.

When the top tests have been rank-ordered, there may be some with
equal or very similar ratings. In this case, the group can return to the
individual ratings and weigh the pros and cons of the results of the review
in order to eventually reach consensus on which test is most appropriate
for the testing program. If after rank-ordering the potential tests there
is one test rated high enough that it stands head and shoulders above the
crowd, then the job is done: a test has been selected.
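
The rank-ordering step, including the check for tests with equal or very
similar ratings, can be sketched as follows. This is an illustration
under assumptions: the numerical scores, the test names, and the
tie_margin that defines "very similar" are hypothetical and would be set
by the committee.

```python
def rank_tests(scores, tie_margin=0.5):
    """Rank tests by score and list those within tie_margin of the top.

    Returns (ranked, contenders). If contenders has one entry, a single
    test stands clear of the rest; otherwise the committee returns to
    the individual ratings to decide among the contenders.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    top_score = ranked[0][1]
    contenders = [name for name, s in ranked if top_score - s <= tie_margin]
    return ranked, contenders


# Hypothetical summative scores for three candidate tests.
ranked, contenders = rank_tests({"Test A": 14.0, "Test B": 16.0, "Test C": 15.8})
print(ranked)
print(contenders)
```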



Test Development 



In the programs surveyed, test development typically proceeded in one
of three ways: (1) by constructing the test from the ground up, which
includes writing all of the items; (2) by selecting the items from item
pools or from other tests, or by modifying an existing test; or (3) by
combining the two methods, writing some items and selecting others.
Regardless of which approach is taken, the procedures that can help to
ensure that the test meets its intended purpose are similar. These are:



— Identifying personnel to develop tests; 

— developing test specifications; 

— developing item specifications; 

— writing/selecting items; 

— reviewing and editing items; 

— field-testing the instrument; 

— conducting validity review; 

— modifying the test, if necessary. 



Each will be considered in turn. 






Identifying Personnel to Develop Tests 

Depending on the scope of the test development project, a number of
personnel may be needed to complete the process. These include item
writers, editors and reviewers, test administrators, project coordinators,
content experts, and technical experts to assist in designing, scoring,
analyzing, and reporting the results of the test. Whether personnel with
these skills are to come from local school districts, department staff, or
consulting agencies will depend on the individual situation; advantages
and disadvantages exist for each possibility.

Teachers may be called upon to perform many of the developmental
tasks, particularly test item development, validity review, and test
administration. In relation to the task of item development, programs
have found both advantages and disadvantages to using local teachers.
According to Miller (1979), many teachers have had little or no training
in evaluation and find it difficult to develop good test items. On the
other hand, teachers have a vested interest in developing a test that may
affect their schools: they may make up for lack of expertise with their
enthusiasm and willingness to learn. Training teachers to develop test
items can be beneficial to students, to school systems, and to the
teachers themselves; it may also save money if teachers are willing to
contribute their time and energy in return for training and experience. A
test developed by local personnel is more likely to receive strong support
in the schools than a test developed elsewhere.

In Peterborough, New Hampshire, administrators hired two district
teachers and sent them to Educational Testing Service (ETS) to learn how
to write and edit test items. Although prior to this experience they had
no specialized training in this area, the teachers assisted in the
development of the competencies and assessment, and now administer and
score competency assessments at all grade levels. By comparison, South
Burlington administrators chose to develop all instruments in-house and
provided training in a summer workshop for interested teachers. Those
participating received credit toward advancement on the district's salary
schedule.

Another option for administrators interested in providing staff
members with special knowledge in test development is to bring in
consultants who will train district staff. In Gary, Indiana, for example,
consultants from ETS taught teachers how to score essay tests holistically.

Test development experts are often more helpful in those areas which
require technical expertise, such as designing the test and analyzing the
results, expertise which is less likely to be available in the school
districts or in a department. The budget, however, may determine whether
consultants are employed in a test development project. Some compromises
may be possible; districts may opt to develop some tests on their own and
contract for others. Although Gary, Indiana developed an oral proficiency
test and hired consultants to teach the scoring of an essay test, the
reading test was developed by Westinghouse Learning Corporation, which
selected items to match the competencies.

Finally, the program staff generally assumes the responsibility
for monitoring and coordinating personnel, and completing the activities
involved in the project.



Developing Test Specifications

As mentioned in an earlier section, a blueprint or set of
specifications for the test is helpful before construction of the test
begins. A test developer who builds a test without a preliminary design
faces the same risk as an architect who begins construction of a house
before drawing the blueprint: the structure may collapse. For examples of
test specifications, see California's Technical Assistance Guide (1978),
Appendix B. Sample test specifications from materials prepared by the
Beaumont Unified School District are included in this document.

When designing the test, an important consideration is the domain
covered by the test. Each subarea of the domain is usually to be
represented on the test in proportion to its importance in the domain. For
example, a possible domain may be mathematics, which is to be measured in
a one-hour test. Then within the domain there may be subareas such as
mathematical computation, number concepts, geometry and measurement, and
problem solving. When the relative importance of each subarea within the
domain has been determined (by public survey, job analysis, committee
consensus, etc.), each subarea is represented proportionally on the test.
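
The proportional allocation just described is simple arithmetic, sketched
here for the hypothetical 60-item mathematics test this chapter uses as
its example. The function name and the largest-remainder rounding rule
are assumptions added for illustration; the 10% weight for geometry and
measurement is implied by that subarea's 6 items out of 60.

```python
def allocate_items(weights, total_items):
    """Split total_items across subareas in proportion to percentage weights.

    weights maps subarea name to its percentage of the domain (summing
    to 100). Fractional items are resolved by giving leftover items to
    the subareas with the largest remainders, so the total is preserved.
    """
    exact = {k: total_items * w / 100 for k, w in weights.items()}
    alloc = {k: int(v) for k, v in exact.items()}
    leftover = total_items - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc


weights = {
    "Computation": 20,
    "Concepts": 30,
    "Geometry and Measurement": 10,
    "Problem Solving": 40,
}
allocation = allocate_items(weights, 60)
print(allocation)
```

When the percentages do not divide the test length evenly, the rounding
rule matters, and the committee may prefer to assign the odd items by
judgment instead.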

The sample chart below may be useful in constructing test
specifications; the numbers and competencies have been devised to
describe a mathematics test 60 items in length. Note that in this
hypothetical domain there are four subareas, each of which is tested by a
number of items proportionate to its predetermined relative importance.
Within a subarea, also, the importance of each competency has been
determined, and each competency has been assigned the number of items
needed to measure it.



                            DOMAIN: MATHEMATICS

                        % of      Number of                    Number of
Subarea                 Domain    Items/Subarea   Competency   Items/Competency

I.   Computation         20%          12             1.              4
                                                     2.              8

II.  Concepts            30%          18             3.              6
                                                     4.              6
                                                     5.              6

III. Geometry and        10%           6             6.              2
     Measurement                                     7.              4

IV.  Problem             40%          24             8.              6
     Solving                                         9.              8
                                                    10.             10

TOTALS                  100%          60                            60






One purpose of drawing up test specifications is to obviate problems
that might arise in the future. The types of scores expected and the
decisions to be made on the basis of test results will help to define the
test specifications. Within the subareas of a particular test, the number
of items matched to each competency must also be determined. In general,
the larger the number of items, the greater the reliability of the test
results and the greater confidence one can place in evaluation decisions
(Berk, 1979). Berk recommends 5-10 items per competency for classroom-
level decisions and 10-20 items for school-, system-, and state-level
decisions, and this was generally observed in the field.
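
The relationship between test length and reliability can be made
concrete with the Spearman-Brown formula, a standard psychometric result
that is not cited in the source and is introduced here only as an
illustration. It assumes the added items are parallel to the existing
ones, and the reliability figure in the example is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predict reliability after changing test length by length_factor.

    reliability is the reliability of the current set of items (0 to 1);
    length_factor is new length divided by current length. Assumes the
    added items are parallel to the existing ones.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Hypothetical: a 5-item competency score with reliability 0.60,
# doubled to 10 items.
predicted = spearman_brown(0.60, 2)
print(round(predicted, 2))
```

Under these assumptions the gain from each added item shrinks as the
test grows, which is consistent with recommending a range of items per
competency in place of "as many as possible."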

When the test specifications have been completed, each subsequent
step in the development process can readily follow from these
specifications.



Developing Item Specifications

Test specifications are to tests as item specifications are to items:
they help you plan in advance to determine what the items will look like.
As Dahl (1971) and Rovinelli and Hambleton (1976) point out, the most
important requirement in the construction of a criterion-referenced test
is establishing a direct relationship between each item on the test and
the competency it purports to measure. Translating competencies into
items is an essential step in establishing the validity of the test, and a
carefully designed framework for this process can significantly improve
chances of success (Priestley & Nassif, 1979). Item specifications can be
of tremendous value in this process because they can determine exactly
what types of items must be written or selected to measure each competency.
The process of reviewing these specifications may also, as in New Jersey,
for one, serve to promote confidence that an appropriate and useful
instrument is being constructed.



Generally, specifications for selected-response items include most or
all of the following characteristics: a statement of the competency, a
sample item, stem attributes (how the question is to be presented), and
response attributes (how the distractors are to be constructed). They may
also include stimulus attributes (description of stimulus material the
item requires, e.g., the length and difficulty level of a reading passage)
and a description of the content domain (e.g., what subjects can be tested
across a set of items matched to the competency).
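
The parts of a selected-response item specification listed above can be
captured in a simple record, which makes it easy to check that no
required part has been left blank. This is a sketch only: the class name,
field names, helper method, and the example content are all hypothetical
and do not come from any program described in this chapter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ItemSpecification:
    # Parts that specifications generally include.
    competency: str
    sample_item: str
    stem_attributes: str        # how the question is to be presented
    response_attributes: str    # how the distractors are to be constructed
    # Parts that specifications may also include.
    stimulus_attributes: Optional[str] = None
    content_domain: Optional[str] = None

    def missing_core_fields(self):
        """Return the names of required parts that were left blank."""
        core = ("competency", "sample_item", "stem_attributes",
                "response_attributes")
        return [name for name in core if not getattr(self, name).strip()]


# Hypothetical specification for a reading competency.
spec = ItemSpecification(
    competency="Identify the main idea of a short passage.",
    sample_item="What is this passage mostly about?",
    stem_attributes="A direct question following the passage.",
    response_attributes="Four choices; distractors are details from the passage.",
    stimulus_attributes="One paragraph at grade-level readability.",
)
print(spec.missing_core_fields())
```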






For constructed-response items such as essay tests or oral-response
questions, the specifications usually include a description of the testing
situation (e.g., "The student will listen to questions on a paced tape and
record his or her responses"), and a set of scoring criteria to determine
whether or not a response is adequate.

The ideal situation is to have the same people who first developed
the item specifications then write or select the items. Through
discussion, close examination of the competencies, and attempts to define
them more clearly, the specification developers can acquire a detailed
knowledge of just what each competency entails and how it should be
measured. Once the type of measurement and its characteristics have been
defined, item development may proceed.

The development of item specifications, as stated above, is an
optional step; many test development procedures go directly from
competencies to items without it. Whether or not this extra step is taken
will depend upon factors considered in this section, particularly those
related to test validity. If the validity of a test is paramount, then
item specifications will help to ensure that the process of item
development will generate a valid test. Item specifications may also save
considerable time and money by averting a situation in which the first
administration of a completed test reveals that the test does not measure
what it was intended to.



Writing/Selecting Items 

To construct a test, items may all be written, may all be selected,
or may be generated through a combined approach.

If the test items are selected, two considerations arise: (1)
availability of sources from which items can be selected, and (2) how
these items will be matched to the competencies already identified. Items
can be selected from item pools or from published tests, usually tests in
the public domain. Complete item pools, or item banks, do not yet exist,
although they are under development in many areas (AASA, 1978). In some
cases, local school districts and consortia have pooled their resources to
develop item banks, and some states (e.g., California, New York) are in
the process of developing pools of items for use by their local districts.
Commercially developed item pools are also available. (See Appendix A for
sources for items and tests.)




Selecting items from published tests, which Florida chose to do to
obtain some of its items, may incur additional costs for permission to use
the items (some publishers charge a standard rental fee per item, per
administration), or it may simply require a request for permission to
reprint copyrighted materials and an acknowledgement to the publisher
cited in the test booklet. The fact that tests in the public domain do
not require permission for reprinting may be a trade-off for lower
quality; items are released after they have been used, and may sometimes
be outdated.

The second important factor in selecting items is establishing
congruence between what the items measure and what the competencies are
intended to cover, i.e., matching items to competencies. The most widely
used approach to this task is a review by a committee of qualified
persons, often teachers and evaluators. The first step is to define the
exact intent of each competency; next, to identify the content or skill
measured by each item reviewed; last, to match the items to the
competencies. This review can be performed on a group or individual basis,
in much the same way as the review for selecting published tests.

Although this approach sounds simple, it presents some difficulties
in that criteria other than item/competency match must be applied. For
example, item bias is a concern; ensuring that the set of items matched to
one competency covers a representative sample of skills or knowledge in
the domain defined by the competency can be difficult; and equating
difficulty levels of items measuring a single competency can be a problem.
(For a more comprehensive list of criteria for item review, see "Reviewing
Items.")


If the decision is to write the items required to construct a test,
an important consideration again is item/competency match. The use of
item specifications is one effective way to help ensure that valid items
will be produced. The likelihood of producing valid items will be greater
if the item writers are qualified content specialists with demonstrated
"minimum competency" in writing skills, and if they have been carefully
and systematically trained by a professional experienced in item writing,
editing, and item writer training. Those with little or no experience in
item writing may need a practical introduction to evaluation concepts
before they begin writing. Also, if the team of writers is a
representative sample with respect to ethnic background, sex, and cultural
perspective (insofar as this is feasible), this may decrease the
possibility of bias in items across the test. (For sources of information
on how to write test items, see Appendix A.)

Other procedures which may be followed to ensure the validity and
quality of test items take place after the items have been written, at
least in first draft form. These procedures are described in the next
four subsections.




Reviewing and Editing Test Items



Although the entire set of items is likely to be reviewed many times
in the course of test development, and it is useful to have as many people
as possible participate in the review process, for the very first review
it may be more desirable to submit the items to a two-member team
consisting of a qualified reviewer and a qualified editor who can examine
each item for content and grammar, respectively, as well as for the
quality of the item as a measurement device. The team is likely to be most
effective if both reviewer and editor have been trained in certain aspects
of psychometrics and test development.

Items are generally reviewed on the basis of many criteria. Some of
these criteria, compiled from a number of programs, are listed in Table 1,
and may be used for reviewing all types of items, either written or
selected. For a listing of some additional guidelines for reviewing and
editing, see California (1978), Appendix C. For the purposes of clarity,
item-related terms used below are defined as follows:



DIRECTIONS:     Instructions used to orient examinees to the item
                format. How to answer the question(s).

STIMULUS:       A reading passage, picture, chart, etc., that
                includes information necessary to the item.

STEM:           The main body of the item which states any necessary
                facts and asks the actual question.

ALTERNATIVES:   Possible answers to choose from, which often include
                a correct response and one to four distractors.

DISTRACTOR:     An incorrect response in a set of alternatives or
                possible answers.

Once the test items have been reviewed and edited, they may still
need revision by the item writer, or they may have to be replaced by new
items. When a set of acceptable items has been produced, they can then
be reviewed by a committee of qualified persons who did not participate
in the writing process. The most widely used approach is to select a
committee that comprises teachers, one or more members of the departmental






TABLE 1

Item Review Criteria


--  Is the item closely matched to the competency, i.e., does
    the item measure knowledge or a skill within the domain of
    the competency?

--  Is the knowledge or skill measured by the item a significant
    aspect of the domain (which may contain an almost infinite
    number of possibilities)?

--  Is the format of the item suited to the skill or knowledge it
    is intended to measure?


--  Could the item be more difficult to one group than to
    another because of an unstated assumption or esoteric
    wording? Is the item biased in terms of sex, race, age,
    culture, religion, or region? Does it contain a stereotype?

    (NOTE: Bias is a multi-level criterion, i.e., each item can
    be reviewed individually and as part of the entire set of
    items. One item which presents a woman as a secretary or a
    Chinese man working in a laundry is not necessarily biased;
    if similarly stereotyped situations occur in many items, then
    the test as a whole may be biased.)

--  Could the item be offensive to a member of any ethnic
    group?




Accuracy

--  Is the item grammatically correct?

--  Is there only one correct response?

--  Are the stem and alternatives clearly stated and unambiguous?

--  Are there structural clues to the correct response?

--  Are all distractors plausible but still incorrect?


Difficulty

--  Is the readability level of the item appropriate to the grade
    level?

--  Is the level of difficulty of the skill or knowledge required
    by the item appropriate for the designated grade level?

--  Are all items matched to each competency geared at about
    the same level of difficulty?




-77- 



C295 



TABLE 1 (continued) 



Interest Level 



— Will the item-related material, e.g., reading passage, be 
interesting to examinees? 

— Is there enough variability in approach and content across 
. items, to the extent that variability is possible, to make the 

test interesting? 



Practical Considerations 



Is the item simply too big and unwieldy to be included, e.g., 
an item which requires a student to choose one of four maps 
when a question about one map would suffice? 



— Is the format of the item clear and understandable? 

— Are the directions clear, concise, and unambiguous? 






staff, and perhaps parents, student representatives, members of the
business community, legislators, etc. The selection of this group may
depend on the importance of the test and its consequences, the amount of
time and money available, and the nature of the test itself (i.e., the
subject matter, as in a reading test versus a test in engineering). As
large and representative a cross-section of people as possible, given
project constraints, is desirable.

It may be necessary to orient committee members to the testing
program and train them in how to interpret the competencies and review
test items. Results from a committee consensus on each item written will
often determine the amount of revision necessary before field testing.



Field-Testing the Instrument 

During the site visits, program personnel identified two procedures as
particularly useful in developing a competency-based instrument:
(1) field-testing the instrument with a representative sample of the
population for whom the test is intended, and (2) examining the test and
its results to determine the effectiveness of the test in measuring what
it purports to measure. The first procedure, field-testing, is considered
here. The second involves determining validity and reliability, which
are discussed in the next section.

The purposes of a field test usually include one or more of the
following: (1) to refine the test items; (2) to identify "bad" items,
i.e., items which do not yield the kinds of results expected; (3) to
obtain baseline data for assessment; (4) to obtain data for designing the
final test(s), e.g., data from which to construct parallel or equivalent
test forms; and (5) to gather information useful for refining the
instrument as a whole, e.g., time required for administration.
Field-testing is that step in developing a test which will generate the
results from which evaluative decisions may be made. Field-testing may
also be required for selected tests, for one or more of the purposes
stated above.

When the field test results have been collected and analyzed, their
interpretation can be used to refine the test items, to provide a basis
for final test design, and to provide empirical data for selecting items
to be used in the final test.
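
One common way to carry out such an analysis is a classical item
analysis. The sketch below is illustrative only: the function names, the
0/1 scoring of responses, the use of upper and lower thirds, and the
flagging limits are all assumptions introduced here, not procedures
reported by the programs surveyed. For each item it computes the
proportion of examinees answering correctly and a simple discrimination
index (the difference in that proportion between high-scoring and
low-scoring examinees), and it lists items that may need revision.

```python
def item_analysis(responses, group_fraction=1/3):
    """Return difficulty and discrimination for each item.

    responses: one row per examinee, each row a list of 0/1 item scores.
    group_fraction: share of examinees placed in the upper and lower
    groups (an assumed value; other fractions are also used).
    """
    n = len(responses)
    n_items = len(responses[0])
    # Rank examinees by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(n * group_fraction))
    upper, lower = ranked[:k], ranked[-k:]
    results = []
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n
        p_upper = sum(row[j] for row in upper) / k
        p_lower = sum(row[j] for row in lower) / k
        results.append({"item": j + 1,
                        "difficulty": p,
                        "discrimination": p_upper - p_lower})
    return results


def flag_items(results, p_min=0.20, p_max=0.95, d_min=0.20):
    """List items whose statistics fall outside the assumed limits."""
    flagged = []
    for r in results:
        out_of_range = not (p_min <= r["difficulty"] <= p_max)
        weak = r["discrimination"] < d_min
        if out_of_range or weak:
            flagged.append(r["item"])
    return flagged


# Hypothetical field-test data: six examinees, four items.
field_test = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]
print(flag_items(item_analysis(field_test)))
```

A flagged item is only a candidate for review; whether it is revised,
replaced, or kept remains a judgment for the item writers and reviewers.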






Conducting Content Validity Review 

In recent years content validity review has become increasingly
important, particularly in relation to tests which are used for
certification, licensing, and high school graduation. Whether or not this
step is taken, however, will depend on the situation.

Essentially, a group of content specialists in a particular field
review the test items to determine whether or not they are content-valid.
On the basis of the ratings of specialists, an item is classified as valid
or not valid. (See the next section, "Establishing Validity and
Reliability," and Chapter 4, "Standard Setting," for further discussion of
rating procedures.) Results of a content validity review can be used to
refine the items; e.g., an item declared invalid at the first review may
contain only a minor error that can be emended.

If a test does not require the time and expense of this somewhat
complicated, technical process, e.g., if the test is intended for use as
a classroom instrument for fourth-graders, then a content review by
specialists in a particular field may be considered. This may be helpful
in several ways: a specialist may discover something in the test which is
outdated as the result of a recent discovery (e.g., the planet Pluto is
temporarily not the farthest from the sun; thus a science test item
becomes invalid). Or the expert may notice something that was overlooked
in several reviews by staff members who have been closely connected with
the project. A disinterested eye can often spot flaws that go unnoticed
by others. Also, review and a "stamp of approval" from recognized
specialists can give the test a great deal more credibility in the eyes
of the public and of other professionals.



Modifying the Test, If Necessary



The final step in developing a test is to modify the items and the 
test design itself, if necessary, on the basis of the results of the item 
reviews, the field test, and the content validity review. When this step 
has been completed, the test can be prepared for printing, distribution, 
and/or administration. 






Establishing Validity and Reliability



Of the many technical and legal issues related to minimum competency
testing, the validity of the instrument used to certify attainment of the
competencies is generally agreed to be one of the most important. A close
second is the issue of test reliability. The steps required for the
process of either test selection or test development may be determined by
the need for establishing validity and reliability. The importance of
these issues bears a direct relationship to the seriousness of
consequences to the examinees and the likelihood of legal challenge. A
test required for high school graduation, teacher certification, or
professional licensure is much more likely to be challenged on legal
grounds than, for instance, a test used to diagnose reading difficulties
among third-graders. In Florida, for example, the SSAT-II, which is a
requirement for high school diplomas, has been challenged in the courts
(Debra P. v. Turlington, 1978); as a result, it was necessary to
establish the validity of the instrument and its reliability as a basis
for decisions related to the attainment of competencies. For a test that
is susceptible to legal challenge, technical assistance from test
developers, measurement experts, and legal experts may be desirable.



Types of Validity 

The purpose of this section is to provide some practical definitions
of technical terms, a discussion of ways in which these issues might
affect a minimum competency testing program, and suggestions for
procedures which can be used to establish the validity and reliability of
a test instrument.

When a challenge to a test arises, the courts generally rely on the
widely accepted Standards for Educational and Psychological Tests (APA,
AERA, NCME, 1974) as the authoritative source on such issues as validity
and reliability. The Standards recognize three major types of validity,
which are defined below.

Content validity requires that the skills, knowledge, and behaviors
measured by a test constitute a representative sample of the skills,
knowledge, and behaviors in the performance domain. Critical components
of content validity include the clear definition of a performance domain,
of the competencies on which the test is based, and of the method for
sampling from the domain.




Construct validity refers to the ability of a test to measure the
constructs or intellectual concepts which it is designed to measure.
Examples of such constructs are reading readiness, management aptitude,
and attitude. Establishing construct validity requires one or more
predictions about the hypothetical characteristics of examinees who score
high on the test as opposed to those who score low, and data with which
to prove the validity of these predictions.

Criterion-related validity includes both concurrent validity and
predictive validity. Concurrent validity consists of establishing the
validity of an instrument by analyzing it in relation to a concurrent
criterion, e.g., a student's grades or scores on an existing test already
proven valid. Predictive validity requires the demonstration of a
correlation between performance on the test and degree of success in
relation to the criterion being predicted, e.g., college entrance or job
performance.
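
The correlation referred to above is usually reported as a validity
coefficient. The following is a minimal sketch, assuming the Pearson
product-moment correlation is the statistic chosen and that each examinee
has both a test score and a criterion measure; the function name and the
sample figures are hypothetical and are given only to show the
arithmetic.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: competency test scores and a concurrent criterion
# (grade averages) for the same six students.
test_scores = [52, 61, 58, 70, 75, 66]
grade_averages = [2.1, 2.8, 2.5, 3.2, 3.6, 3.0]
print(round(pearson_r(test_scores, grade_averages), 2))
```

For predictive validity the same calculation applies, except that the
criterion is collected later, e.g., a measure of job performance obtained
after graduation.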

In addition to these types of validity defined by the Standards,
three other types are often mentioned. These include the following:
curriculum validity, which demonstrates the degree to which a test
measures what is purportedly taught in the schools (often considered part
of content validity); instructional validity, which demonstrates that
students have actually been taught what is on the test (often considered
part of criterion-related validity); and face validity, a nontechnical,
informal term that implies that a test looks valid, i.e., appears to be a
reasonable measure of the desired competencies.

Construct validity, according to Linn (1979), is useful in minimum
competency testing if the inferences made from the test results lead to
expectations about the examinee's aptitude, e.g., a student who passes a
test in addition is now ready to begin learning subtraction. This type of
validity, however, is difficult to achieve and seldom practical for
application to achievement tests developed by educators to determine
mastery/nonmastery (Nassif, 1978).

Federal guidelines state that a test for licensure or certification
can be considered valid if it can be shown that the test measures a
representative sample of the skills required in the performance of the
job for which a candidate will be licensed or certified (EEOC, 1977). Both
criterion-related validity and content validity are considered acceptable
means by which to establish job-relatedness. This concept of
job-relatedness is analogous to the requirement that a minimum competency
test used in an academic setting must measure the skills and knowledge
required of a student to perform in school or after graduating from the
school.

Content validity can be used to support the inference that a person
who passes a test based on a clearly defined domain and made up of a
representative sample of items measuring that domain has attained at least
some degree of competency in relation to the skills and knowledge
identified. This is the most practical approach to validating a test based
on school skills. A test based on life skills or survival skills, however,
may require the use of predictive validity to show that results from a
test a student has taken in school have a definite relationship to success
in life beyond the schoolyard.

McClung (1977) contends that the most important types of validity for
a minimum competency test will more often be curriculum and instructional
validity. A test may be challenged on a sound legal basis if it does not
measure what students are taught in school; a test may be challenged,
therefore, if it measures the stated curriculum but that curriculum is not
actually taught in the classroom.



Types of Reliability 

Reliability refers to the degree to which the results of testing
are attributable to systematic sources of test score variance (Standards,
1974). In other words, a test is considered reliable if it generates
comparable test scores across time, across test forms, and/or across
subareas of the test's domain.

Reliability is particularly important in relation to the
generalizability and consistency of inferences made on the basis of test
scores, e.g., mastery/non-mastery (Hambleton & Novick, 1973; Linn, 1979).
This is perhaps the characteristic of reliability that is most relevant to
minimum competency testing.
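
The consistency of mastery/non-mastery inferences can be made concrete
with a small calculation. The sketch below assumes that the same
examinees have taken two administrations or parallel forms without
intervening instruction and that a single cutoff score applies to both;
the function name and the sample scores are hypothetical. It reports the
proportion of examinees classified the same way both times, together with
Cohen's kappa, which discounts the agreement expected by chance. These
two indices are one possible choice, not a procedure prescribed by the
sources cited.

```python
def decision_consistency(scores_a, scores_b, cutoff):
    """Compare mastery decisions from two administrations.

    Returns (proportion classified the same way both times, kappa).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and equal in length")
    n = len(scores_a)
    master_a = [s >= cutoff for s in scores_a]
    master_b = [s >= cutoff for s in scores_b]
    p_agree = sum(a == b for a, b in zip(master_a, master_b)) / n
    rate_a = sum(master_a) / n
    rate_b = sum(master_b) / n
    # Agreement expected if the two sets of decisions were unrelated.
    p_chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_chance == 1:
        # Everyone fell in one category both times; kappa is undefined,
        # so 1.0 is reported here as a convention of this sketch.
        kappa = 1.0
    else:
        kappa = (p_agree - p_chance) / (1 - p_chance)
    return p_agree, kappa


# Hypothetical scores for six students on two forms, cutoff of 14.
form_1 = [10, 12, 15, 18, 20, 8]
form_2 = [11, 14, 16, 17, 13, 9]
agreement, kappa = decision_consistency(form_1, form_2, cutoff=14)
print(round(agreement, 2), round(kappa, 2))
```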

Characteristics of reliability include one or more of the following
elements: comparability of test forms, which refers to the consistency of
scores across different forms of the test designed to be parallel, or
equivalent; internal consistency, which refers to the correlation of
scores between test halves or subtests in a test battery; and
comparability over time, which refers to the reproducibility of scores on
a test given more than once.

In a minimum competency testing program, several situations may arise
which would increase the desirability of ensuring test reliability. For
example, if a test is given to high school students, some of the students
may fail, receive additional instruction or remediation, and then take the
same test again or take a second form of the test. The student's scores
on each test must be consistent with respect to the student's achievement
in order to produce reliable results. Also, a minimum competency test may
comprise subdomains, such as reading and mathematics, and/or subareas
within the domains, such as problem solving and computation. The level of
difficulty across subareas and the reliability of scores achieved in these
subareas must be established.



Procedures for Establishing Validity and Reliability 



Certain accepted procedures for establishing the reliability and
validity of a test seem applicable to minimum competency testing; they
will be described here briefly. Further assistance in these procedures
can be obtained from the extensive literature extant on these subjects
and through the use of professional testing specialists.



Establishing Validity 



The most important type of validity in a minimum competency program
is likely to be content validity, i.e., establishing that the test
measures the specified domain of competencies. This is also the most
practical type of validity procedure to conduct in terms of cost, time,
and usefulness of the results, particularly in a public school testing
program. As mentioned earlier, content validity procedures stand up in
court as acceptable if done correctly.

Probably the most widely used approach to determine content validity
is the review of the test items by a group of at least 10 content
specialists. Each content specialist reviews each item independently on
the basis of four criteria: item/competency match (whether the item
measures the competency, and whether the entire set of test items
constitutes a representative sample of all the competencies in terms of
their relative importance); significance (whether each item measures a
significant aspect of the domain of a competency); bias; and accuracy
(whether there is only one correct response). In addition to these
criteria, a content validity review may incorporate an analysis of the
level of difficulty of each item and its appropriateness on a minimum
competency test (see Chapter 4, "Standard Setting").






Each reviewer should be trained in the rating procedure, and then
instructed to rate each item individually as valid or not valid, on the
basis of the established criteria. Results from all of the independent
reviews are collected and analyzed for consistency across reviewers, and
then presented in summary form for interpretation. A consensus of the
reviewers is necessary to establish the content validity of an item;
however, the number required for consensus will vary with the situation
(see Nassif, 1979). All items rated as valid are then available for use on
the final test.
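
The tallying step described above can be sketched as follows. The
example assumes each reviewer records a simple valid/not valid rating for
each item, and it uses an 80 percent agreement requirement purely as a
placeholder, since the number required for consensus varies with the
situation; the function name, the item labels, and the ratings are
hypothetical.

```python
def summarize_reviews(ratings, required_share=0.8):
    """Summarize independent valid/not-valid ratings for each item.

    ratings: dict mapping an item label to a list of booleans, one per
    reviewer (True means the reviewer rated the item valid).
    required_share: assumed share of reviewers needed for consensus.
    """
    summary = {}
    for item, votes in ratings.items():
        share_valid = sum(votes) / len(votes)
        summary[item] = {
            "reviewers": len(votes),
            "share_valid": share_valid,
            "valid": share_valid >= required_share,
        }
    return summary


# Hypothetical ratings from ten content specialists.
ratings = {
    "item_1": [True] * 9 + [False],
    "item_2": [True] * 6 + [False] * 4,
}
for item, result in summarize_reviews(ratings).items():
    print(item, result["share_valid"], result["valid"])
```

Items that fall short of the requirement need not be discarded; the
reviewers' comments may point to a minor error that can be corrected
before a second review.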



Establishing Reliability

Reliability is determined on the basis of empirical data collected
from actual test administration(s). This can be done by administering the
same test to the same examinees at two different times (with too little
time in between administrations to allow significant learning by the
examinee); administering two parallel forms of the test to the same
examinees; or administering two halves of the test at once, providing that
both test halves are representative of the same domain.
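
The third approach, using two halves of one test, can be sketched as
below. The sketch assumes items are scored 0 or 1, splits the test into
odd-numbered and even-numbered items (one of several possible splits),
correlates the two half scores, and then applies the Spearman-Brown
formula to estimate the reliability of the full-length test. The function
names and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(responses):
    """Return (half-test correlation, estimated full-test reliability).

    responses: one row per examinee, each row a list of 0/1 item scores.
    """
    odd_scores = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even_scores = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_scores, even_scores)
    # Spearman-Brown step-up from half length to full length.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical data: four examinees, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
r_half, r_full = split_half_reliability(responses)
print(round(r_half, 3), round(r_full, 3))
```

With so few examinees the figures mean little; an actual estimate would
rest on the full field-test or operational sample.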

Different methods of estimating reliability are designed to account
for different sources of measurement error (Standards, 1974). As a
general rule, the longer the test is (in terms of the number of items),
the more likely it is to be reliable. In many minimum competency testing
situations, however, the test is not usually of sufficient length to
ensure reliability on this basis alone (Linn, 1979).
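
The relationship between test length and reliability is often
approximated with the general Spearman-Brown formula. The sketch below is
offered under the usual assumption of that formula, namely that any added
items are comparable to the existing ones; the function names are
hypothetical and the figures are for illustration only.

```python
def projected_reliability(reliability, length_factor):
    """Reliability expected if the test is lengthened by length_factor.

    length_factor: new length divided by current length (2 = doubled).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability, target):
    """Factor by which the test must be lengthened to reach the target."""
    if not (0 < reliability < 1 and 0 < target < 1):
        raise ValueError("reliability and target must lie between 0 and 1")
    return (target * (1 - reliability)) / (reliability * (1 - target))


# A short test with reliability 0.60, doubled in length.
print(round(projected_reliability(0.60, 2), 2))
# How much longer the same test must be to reach 0.75.
print(round(length_factor_needed(0.60, 0.75), 2))
```

Because a competency is often measured by only a handful of items, the
projected figure for a single competency can remain modest even when the
test as a whole is long.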



Appendix A

Sources

Tests: published

Buros, O. K. (Ed.). The nineteen thirty-eight mental measurements
     yearbook. Highland Park, New Jersey: Gryphon Press, 1972.
     (Originally published, 1938)

Buros, O. K. (Ed.). The nineteen forty mental measurements yearbook.
     Highland Park, New Jersey: Gryphon Press, 1972. (Originally
     published, 1941)

Buros, O. K. (Ed.). The third mental measurements yearbook. Highland
     Park, New Jersey: Gryphon Press, 1949.

Buros, O. K. (Ed.). The fourth mental measurements yearbook. Highland
     Park, New Jersey: Gryphon Press, 1953.

Buros, O. K. (Ed.). The fifth mental measurements yearbook. Highland
     Park, New Jersey: Gryphon Press, 1959.

Buros, O. K. (Ed.). Tests in print. Highland Park, New Jersey:
     Gryphon Press, 1961.

Buros, O. K. (Ed.). The sixth mental measurements yearbook. Highland
     Park, New Jersey: Gryphon Press, 1965.

Buros, O. K. (Ed.). The seventh mental measurements yearbook. Highland
     Park, New Jersey: Gryphon Press, 1972.

Buros, O. K. (Ed.). Tests in print II. Highland Park, New Jersey:
     Gryphon Press, 1974.

Buros, O. K. (Ed.). English tests and reviews. Highland Park, New
     Jersey: Gryphon Press, 1975.

Buros, O. K. (Ed.). Foreign language tests and reviews. Highland
     Park, New Jersey: Gryphon Press, 1975.

Buros, O. K. (Ed.). Intelligence tests and reviews. Highland Park,
     New Jersey: Gryphon Press, 1975.

Buros, O. K. (Ed.). Mathematics tests and reviews. Highland Park,
     New Jersey: Gryphon Press, 1975.

Buros, O. K. (Ed.). Reading tests and reviews II. Highland Park,
     New Jersey: Gryphon Press, 1975.

Buros, O. K. (Ed.). Science tests and reviews. Highland Park, New
     Jersey: Gryphon Press, 1975.

Buros, O. K. (Ed.). Social studies tests and reviews. Highland Park,
     New Jersey: Gryphon Press, 1975.

Buros, O. K. (Ed.). Vocational tests and reviews. Highland Park,
     New Jersey: Gryphon Press, 1975.



Tests: unpublished

Johnson, O. G. Tests and measurements in child development: Handbook
     II. San Francisco, California: Jossey-Bass.

Johnson, O. G., & Bommarito, J. W. Tests and measurements in child
     development. San Francisco, California: Jossey-Bass, 1971.



Test Items 



Instructional Objectives Exchange
P.O. Box 24095
Los Angeles, California 90024

National Assessment of Educational Progress
700 Lincoln Tower
1860 Lincoln Avenue
Denver, Colorado 80203



Sequencing Test Items



Ahmann, J. S., & Glock, M. D. Evaluating pupil growth: Principles
     of tests and measurements (5th ed.). Boston: Allyn and Bacon.

Bloom, B., Hastings, J. T., & Madaus, G. Handbook on formative and
     summative evaluation of student learning. New York: McGraw-Hill.

Gronlund, N. E. Measurement and evaluation in teaching (2nd ed.).
     New York: Macmillan Co., 1971.

Henrysson, S. Gathering, analyzing, and using data on test items.
     In R. L. Thorndike (Ed.), Educational measurement (2nd ed.).
     Washington, D.C.: American Council on Education, 1971.



Review Tests



California, State Department of Education. Technical assistance
     guide for proficiency assessment. Sacramento, California:
     Author, 1977.

Center for the Study of Evaluation. CSE Elementary Test Evaluations.
     Los Angeles: University of California, 1970.

Center for the Study of Evaluation. CSE ECRC Preschool/Kindergarten
     Test Evaluations. Los Angeles: University of California, 1971.

Center for the Study of Evaluation. CSE RBS Test Evaluations: Tests
     of Higher-Order Cognitive, Affective, and Interpersonal Skills.
     Los Angeles: University of California, 1972.

Center for the Study of Evaluation. CSE Secondary School Test
     Evaluations. Los Angeles: University of California, 1974. (three
     volumes)






Center for the Study of Evaluation. CSE Elementary School Test
     Evaluations. Los Angeles: University of California, 1976.

Madaus, G., Airasian, P., Hambleton, R., Consalvo, R., & Orlandi, L.
     Development and application of criteria for screening commercial
     standardized tests for the Massachusetts Basic Skills Improvement
     Policy. Boston: Public Affairs Research Institute, 1979.

National Consortium on Testing. Testing the tests (Staff Circular
     No. 1). Cambridge, Massachusetts: Huron Institute, 1978.

Ohio, State Department of Education. Competency handbook. Columbus,
     Ohio: Author, 1978.



Writing Test Items

California, State Department of Education. Technical assistance
     guide for proficiency assessment. Sacramento, California:
     Author, 1977.

Gronlund, N. E. Measurement and evaluation in teaching (2nd ed.).
     New York: Macmillan Co., 1971.

Ohio, State Department of Education. Competency handbook. Columbus,
     Ohio: Author, 1978.

Wesman, A. Writing the test item. In R. L. Thorndike (Ed.),
     Educational measurement (2nd ed.). Washington, D.C.: American
     Council on Education, 1971.







References 



Ahmann, J. S. Basic issues concerning competency-based testing. In
     R. B. Ingle, M. R. Carroll, & W. J. Gephart (Eds.), The assessment
     of student competence in the public schools. Bloomington, Indiana:
     Phi Delta Kappa, 1978.

Airasian, P., Pedulla, J., & Madaus, G. Policy issues in minimal
     competency testing and a comparison of implementation models.
     Boston: Heuristics, 1978.

American Association of School Administrators. The competency movement:
     Problems and solutions. Arlington, Virginia: Author, 1978.

Berk, R. A. Some guidelines for determining the length of
     objective-based criterion-referenced tests. Paper presented at the
     meeting of the National Council on Measurement in Education, 1978.

Brickell, H. M. Seven key notes on minimal competency testing. In B. S.
     Miller (Ed.), Minimum competency testing: A report of four regional
     conferences. St. Louis, Missouri: CEMREL, 1978.

California, State Department of Education. Technical assistance guide
     for proficiency assessment. Sacramento, California: Author, 1977.

Candor-Chandler, C. Competency measurement at the local level: A case
     study of the Kanawha County Schools, West Virginia. In R. B. Ingle,
     M. R. Carroll, & W. J. Gephart (Eds.), The assessment of student
     competence in the public schools. Bloomington, Indiana: Phi Delta
     Kappa, 1978.

Dahl, T. Toward an evaluative methodology for criterion-referenced
     measures: Objective-item congruence. Paper presented at the meeting
     of the California Educational Research Association, San Diego, 1971.






Hambleton, R. K., & Novick, M. R. Toward an integration of theory and
     method for criterion-referenced tests. Journal of Educational
     Measurement, 1973, 10, 159-170.

Hathaway, W. E. Competency measurement at the local level: A case
     study of the Portland, Oregon Public Schools. In R. B. Ingle,
     M. R. Carroll, & W. J. Gephart (Eds.), The assessment of student
     competence in the public schools. Bloomington, Indiana: Phi
     Delta Kappa, 1978.

Linn, R. L. Issues of validity in measurement for competency-based
     programs. In M. A. Bunda & J. Sanders (Eds.), Policies and problems
     in competency-based measurement. NCME, 1979.

Madaus, G., Airasian, P., Hambleton, R., Consalvo, R., & Orlandi, L.
     Development and application of criteria for screening commercial
     standardized tests for the Massachusetts Basic Skills Improvement
     Policy. Boston: Public Affairs Research Institute, 1979.

McClung, M. Competency testing: Potential for discrimination.
     Clearinghouse Review, August 1977, 439-443.

Mehrens, W. A. The technology of competency measurement. In R. B. Ingle,
     M. R. Carroll, & W. J. Gephart (Eds.), The assessment of student
     competence in the public schools. Bloomington, Indiana: Phi Delta
     Kappa, 1978.

Miller, B. S. (Ed.). Minimum competency testing: A report of four
     regional conferences. St. Louis, Missouri: CEMREL, 1978.

Nassif, P. M. Standard-setting for criterion-referenced teacher licensing
     tests. Paper presented at the meeting of the National Council on
     Measurement in Education, Toronto, March 1978.

National Consortium on Testing. Testing the tests (Staff Circular
     No. 1). Cambridge, Massachusetts: Huron Institute, 1978.






Ohio, State Department of Education. Competency handbook. Columbus,
     Ohio: Author, 1978.

Pasch, M. Minimal competency testing: The problem of validity. Paper
     presented at the meeting of the American Educational Research
     Association, San Francisco, 1979.

Priestley, M., & Nassif, P. M. From here to validity: Developing a
     conceptual framework for test item generation in criterion-referenced
     measurement. Educational Technology, 1979, 19(2), 27-32.

Rovinelli, R. J., & Hambleton, R. K. On the use of content specialists
     in the assessment of criterion-referenced test item validity. Paper
     presented at the meeting of the American Educational Research
     Association, San Francisco, 1976.

Standards for educational and psychological tests. Washington, D.C., 1974.

U.S., Equal Employment Opportunity Commission. Guidelines on employee
     selection procedures. Federal Register, 1978, 43(166), 38290-38309.







CHAPTER 4
SETTING STANDARDS

Paula M. Nassif



Introduction



The purpose of standard setting is to specify the score above which
performance is considered satisfactory and below which it is considered
unsatisfactory, and thus to characterize the capabilities or competencies
of each examinee. Although this score, called the pass/fail or cutoff
score, is of importance to every examinee, it is also a focus of attention
for parents, teachers, public interest groups, and other educators. Since
the ramifications (legal, political, and financial) of setting the cutoff
score are great, it is advisable to consider thoroughly the approach used.

Both the procedures and the issues discussed in this chapter are
drawn from a study of minimum competency testing programs. While the
procedures described are those that have been or are being used to set
standards, the issues reflect a more general and comprehensive focus. At
the state and local levels these issues surfaced in committee meetings and
review sessions, at public gatherings and in print; not all were
documented in state and local materials, many having been mentioned in the
course of interviews. As a result, the issues represent a drawing
together of many resources. Just as certain parameters may be taken into
account in formulating competencies and testing instruments, so should
they be considered when an approach to standard setting is selected. This
chapter will highlight these parameters.

The standard setting strategies that will be discussed in this chapter
are the following: (1) administrative decision or consensus, (2) Nedelsky,
(3) Jaeger, and (4) contrasting groups. Appropriate examples of the
application of each model will be presented. Since the situations,
resources, and needs of each local district or state vary so much,
however, no prescriptive rules will be presented. Rather, the procedures
represent a subset of possible procedures, and the issues a listing of
those that the program planner may want to take into account in setting
standards.






Issues and Parameters 


Of the numerous procedures or strategies for setting the standards
for a competency examination, some are brief and simple, while others are
complex and time-consuming to implement. In the past, most major testing
programs were norm-referenced and the cutoff score was usually established
in relation to the strict statistical characteristics or outcomes of the
test. With the exception of the Nedelsky method (1954), most procedures
commonly in use now have been developed, tested, revised, and implemented
in the past 15 years. And of these procedures, some have emerged directly
as a result of needs arising from minimum competency testing programs.

Issues in the development, selection, and/or implementation of a
standard setting strategy that are currently being considered by program
managers include:



• legal defensibility

     - legal issues

     - uses of expert judgment

• ease of implementation

     - time/expertise available

     - reproducibility of procedures

• public acceptance

• psychometric characteristics

     - single versus multiple cutoffs

     - whether or not to include information about performance levels

     - classification of examinee scores

• political considerations

• financial factors



The next section will discuss legal defensibility, implementation,
and public acceptance. The section on strategies will present specific
technical and psychometric characteristics of each model. Political and
financial considerations will not be discussed.




Legal Defensibility

Legal Issues. Although each of the above factors is important to
consider in the standard setting process, preference and local need will
dictate which issues will assume more or less importance. The legal
defensibility of the test is one issue which deserves careful attention.
Recently, in some notable cases, the courts have disallowed the use of a
licensing or certification instrument because the cutoff score or passing
score had been arbitrarily or capriciously established. In several
decisions the courts have stated that although the required test standard
or minimum performance level may be specified by the test user, such a
score must bear a relationship to minimum job performance. In other
decisions dealing with the statewide establishment and use of cutoff
scores, the courts have ruled that for a standard to be valid and
therefore appropriate for use, it must be job-related and logical (Dent v.
West Virginia, 1889; U.S. v. State of North Carolina, 1975; U.S. v. State
of South Carolina, 1977; Georgia Association of Educators v. Nix, 1975;
Armstead v. Starkville Municipal Separate School District, 1975).

The legal consequences for misclassifying examinees on the basis of a
minimum competency test may be very similar to those cited for
certification and licensing tests. Setting standards by a well-documented,
technically sound method will help to avoid those potential consequences.

Uses of Judgment. Some years ago a major topic of discussion was
whether judgment had a legitimate role to play in standard setting
practices. In the course of the development and implementation of many
different measurement procedures, specialists have come to recognize that
varying amounts of judgment are employed in the establishment of any
cutoff score (Shepard, 1979).

As a result, some researchers have revised their opinions. Jaeger,
who initially classified standard setting models as either judgmental or
empirical, currently holds the following view:



All standard setting is judgmental. No amount of data collection,
data analyses and model building can replace the ultimate judgmental
act of deciding which levels of performance are meritorious or
acceptable and which are unacceptable or inadequate. ... In either
case, subjective judgment of merit is inescapable (Jaeger, 1979,
p. 48).






Strategies can vary with respect to the type of judgments they require
and with respect to the factors that judges are asked to consider. In
general, the extent to which the nature of the judgment can be controlled
(so as to minimize extraneous influences) enhances the defensibility of a
procedure.

As one example of a model which permits a standard to be set in an
arbitrary and capricious fashion, Glass (1977) cites the example which he
refers to as "counting backwards from 100%." In this model, the standard
setters specify that 100% performance on each skill or objective is the
desired outcome. In acknowledgment of a "certain" amount of human error,
the required performance level is reduced from 100% to, say, 93% or 85%.
What is arbitrary and capricious about this procedure and other equally
unstructured approaches is the disregard for real factors and consequences
on the part of those who set the standard.

For example, have the standard setters considered at what point in
"counting backwards from 100%" ordinary human error can be confused with
failure or noncompetency? At what point in the process is this issue
considered or even identified? Is there any consideration for the issue
of what percent of students will pass or fail as a result of one judge's
estimate of error allowance against another's?

Many educators claim that this last factor (i.e., percent of students
passing or failing) has little to do with competency assessment.
Nonetheless, several other models do consider this particular factor as a
means of facilitating a more focused judgment.

Glass points out that attempts to set standards are either "blatantly
arbitrary" (as in the above example) or "derived from a set of arbitrary
premises" (as in other, more structured models). Glass holds the view
that the difficulty of setting standards well, however, does not excuse
educators from doing so when needed; he goes on to caution: "Less
arbitrariness is safer."



Psychometric Characteristics 



The selection of a cutoff score has typically led program planners to
consider the following issues:

— whether to apply single or multiple cutoff scores;

— whether information about examinee performance levels should
be included or omitted;

— whether the classification of examinee scores is correct.

Each of these issues will be discussed briefly.

Single versus multiple cutoff scores. Fundamental to this discussion
of single versus multiple cutoff standards is an understanding of the
difference between multiple (versus single) standards on a test which
apply to all candidates and multiple standards which apply to different
candidates. For an introductory discussion of the latter issue the reader
is referred to Brickell (1977). This chapter will consider only the
former issue: whether to establish single or multiple performance
standards which every student must meet.

In setting the standards for a competency test, one should first
consider the test purpose. If the purpose is to provide diagnostic
information (instead of an overall descriptive determination), one will
follow different avenues in setting standards. When the purpose of the
test and its outcomes are clear, and these have been kept in mind
throughout all the procedural and developmental steps of designing the
competency test, the process of standard setting is facilitated. In order
to decide in favor of either multiple cutoffs or a single cutoff to
determine pass/fail decisions, the researcher who is considering the use
of multiple cutoffs on a test may want to keep in mind the following
points:



— Requiring specified levels of performance on subtests or
subsections within an exam ensures that every examinee classified
as competent possesses some level of competence in each section
of the domain.

— If subtest cutoff scores are used, the stability of each subtest
criterion will depend upon the number of items within each
subsection.

— One effect of establishing multiple cutoff scores (such that
there are various criteria that must be met from subsection to
subsection) is that the number of candidates who pass all
sections will be reduced.
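
The second point above, that the stability of a subtest criterion depends
on the number of items, can be illustrated with a minimal sketch. It
assumes a simple binomial model (independent items, each answered
correctly with the same probability), which is an assumption made here
for illustration and not a model proposed in this chapter. The function
name and the example value of 0.70 are hypothetical.

```python
import math


def proportion_standard_error(p: float, n_items: int) -> float:
    """Binomial standard error of an observed proportion correct.

    p       -- assumed true proportion correct for the examinee
    n_items -- number of items in the subtest
    """
    return math.sqrt(p * (1.0 - p) / n_items)


# Hypothetical examinee whose true proportion correct is 0.70.
for n_items in (5, 10, 20, 40):
    se = proportion_standard_error(0.70, n_items)
    print(f"{n_items:2d} items: standard error of proportion = {se:.3f}")
```

Under this assumption the standard error shrinks as items are added, so
a pass/fail decision made on a five-item subsection is considerably less
stable than one made on a forty-item test.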






The cutoff score in each subsection need not be extremely high, since the
aim is to ensure that examinees possess some level of skill on that
subsection of the test. A trade-off is often necessary here, in order to
keep the cutoff score from being unreasonably high on the subsections of
the test, and yet still above the level of a chance score. In setting
multiple cutoff scores, too, the possibility of misclassification is
increased with each additional cutoff score. If there is only one cutoff
score on the test, there is only one possibility for misclassifying
candidates: at the point of the cutoff score. If there are four cutoff
scores, one for each subsection of the test, there are four possibilities
for misclassifying candidates. (See "Classification of Examinee Scores"
for a discussion of issues of misclassification.)

If there is a single cutoff score, errors of misclassification may be
reduced, but subskill performance is not ensured. For example, a
candidate who achieves a high level of competency in one area of the test
can in this way compensate for extremely low performance in another area.

In addition, one may choose to have a combination of both single
and multiple cutoff score methods: that is, multiple cutoff scores for
subsections of the test as well as a total cutoff score. Such a choice
further increases the possibility of error, however. It will admit to the
field of passing candidates only those who demonstrate a minimum level of
competency in each of the subareas of the test and who, in addition, can
meet some extra criterion in connection with the total test score; since
no compensatory performance is allowed, it is likely that only a small
number of examinees will pass.
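
The three decision rules just described (a single total cutoff, multiple
subsection cutoffs, and the combination of both) can be sketched as
follows. The subsection names, scores, and cutoff values are hypothetical
and chosen only to show how a strong area can compensate for a weak one
under the single-cutoff rule but not under the others.

```python
from typing import Dict


def passes_single(scores: Dict[str, int], total_cutoff: int) -> bool:
    """Single cutoff: only the total score matters (compensatory)."""
    return sum(scores.values()) >= total_cutoff


def passes_multiple(scores: Dict[str, int], subtest_cutoffs: Dict[str, int]) -> bool:
    """Multiple cutoffs: every subsection cutoff must be met."""
    return all(scores[name] >= cut for name, cut in subtest_cutoffs.items())


def passes_combined(scores: Dict[str, int],
                    subtest_cutoffs: Dict[str, int],
                    total_cutoff: int) -> bool:
    """Combination: every subsection cutoff and the total cutoff must be met."""
    return (passes_multiple(scores, subtest_cutoffs)
            and passes_single(scores, total_cutoff))


# Hypothetical examinee; each subsection is scored out of 20.
examinee = {"reading": 19, "writing": 18, "math": 6, "citizenship": 17}
subtest_cutoffs = {name: 10 for name in examinee}
total_cutoff = 56

print(passes_single(examinee, total_cutoff))                     # True: total is 60
print(passes_multiple(examinee, subtest_cutoffs))                # False: math is below 10
print(passes_combined(examinee, subtest_cutoffs, total_cutoff))  # False
```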

Shepard (1979) indicates that the interpretation of data or the use
of results from a test with multiple cutoff scores can be confounded in
two ways: first, the cutoff scores may vary mostly as a function of
variability in the judges' ratings and not as a function of differences in
the importance or complexity of skills; second, the variability in
difficulty of the test items used to measure domains for which there are
different cutoff scores will affect the performance profile of examinees
on those differently scored sections of an exam.

Airasian, Pedulla, and Madaus (1978) consider the decision for a
single or multiple cutoff score in terms of the uses of testing results.
While a total test score allows for pass/fail classification, it provides
little information for diagnosis and remediation. This information is even
less helpful when the test measures heterogeneous content. The authors
indicate that there is no easy answer. The application of a single cutoff
score may yield little diagnostic information, yet is clearly the easiest
method to administer. Multiple cutoffs, although they may increase
classification error and increase administration and record keeping,
generate more specific descriptive information about the individual
examinee's competency.

Inclusion of examinee performance levels. As stated earlier, standard
setting models involve reliance on expert judgment in setting a cutoff
score. Even in the most structured models, judges rely on their
educational experiences as students, teachers, administrators, etc., to
help them set a benchmark or expected performance level. Another factor to
consider is whether performance level (i.e., difficulty level or p-value)
or other normative information should be included in the process of
setting cutoff scores. Among programs which take into account this type of
information about students in setting standards are Rocky River, Ohio and
New Jersey.

Despite recommendations that item difficulty should influence the
cutoff score (Klein and Kosecoff, 1973; Millman, 1974), several procedures
to be considered involve ratings which are independent of actual examinee
performance on the item. There are compelling arguments both for
including and for excluding item difficulty in setting cutoff scores.

One issue that bears on this point is the purpose of the testing
program. If one goal of the program is to classify students as masters or
nonmasters on the basis of an "ideal" level of competency, this may decide
the issue of whether or not the standard should be tied in any way to
current examinee performance. When the cutoff score is to reflect an
ideal level of competency which candidates must achieve, then current
performance information is generally excluded.

In setting a cutoff score independent of normative data, the judges
may define a standard that will result in the need for a great deal of
improvement and change. On the other hand, the researcher may find that
student performance is already quite close to the ideal level. When the
cutoff score is set in relation to an ideal level of performance,
educators can claim that the performance levels required for passing
relate directly to a defined skill level which has been determined by
experts. Such a standard may be said to be uncontaminated by information
about current performance levels which might have led to a relaxation of
the ideal standard.

A disadvantage to this approach is that the ideal level set by expert 
judges may bear little or no relationship to the current performance 
capabilities of students. The judges may conceptualize minimum competency 
in terms of experts, not in terms of the current examinees, and a large 
number of students may fail. 




A modified version of an approach independent of performance is one
in which overall test performance is provided as baseline data.
(Typically, the inclusion of performance levels is accomplished on an
item-by-item basis, or on the basis of a subarea or test section.)

Shepard (1979) claims that if judges create their own subjective
models for normative data, there is a great risk that their comparisons
will not be made on the basis of representative information. Therefore,
she recommends providing representative data to the expert judges.
Similarly, Conaway (1979) recommends that judges who set absolute
standards for objective-referenced tests take the empirical difficulty of
the items into account. He states that the effect of item difficulty on
test scores is "pervasive." Therefore, judges who set standards for
objective-referenced tests should have this information about item
difficulty when reviewing the items which are the link between the
objectives and the test scores. Very high or low standards might result
if, indeed, these levels reflected only the judges' requirements.
Normative information may facilitate the judges' task, since it can
provide them with more guidance or focus. Judges may be able to
incorporate into their decisions the factor of the percentage of students
passing at various cutoff points (if that information is furnished) and
review their ratings accordingly.

An inherent disadvantage in providing such normative information to
the judges is that they may feel inhibited if their ratings depart from
the empirical information provided. One may be concerned that standards
set in this way merely mirror the status quo and provide no incentive for
improving performance. Supplying normative information, such as percent
of students passing at various cutoff scores, may mean that standards will
be set largely on the basis of achieving a desired passing rate, without
considering the content of the test or the level of competency deemed
necessary relative to the domain measured. To provide too much
information may defeat the original purpose of the task.



Classification of Examinee Scores 

A critical issue in setting a standard is that of minimizing error or
misclassification. This issue, in fact, usually takes precedence over all
others in any discussion of standard setting, since it is the measure of
success of the standard setting process. Florida's statewide assessment
program, for example, is one which has paid particular attention to
minimizing the risks of misclassification.






The explicit or implicit goal of a standard setting method is to
achieve the maximally correct classification of examinees. If a student
who is actually a master of the material being measured is classified as a
nonmaster, the classification error is called "false-negative."
Conversely, an actual nonmaster of the content who is assigned mastery
status exemplifies a false-positive classification error.



To assess the extent to which a given standard setting methodology
has accomplished the goal of classification, it is necessary to ask what
proportion of the students tested have been rightly (or wrongly)
classified. The answer to this question entails reference to some other
"true" measure of a given student's mastery status. Where one is unlikely
to obtain such a measure, one can appeal to some intuitive practice aimed
at minimizing the proportion of students misclassified.

The problem is that it is difficult to minimize one type of
classification error without affecting the likelihood of committing the
other type of error. A standard set too low, for instance, is one that
passes not only true masters but also some nonmasters. On the other hand,
a standard set too high is one that fails not only true nonmasters, but
also some masters.

Lowering the cutoff score reduces the likelihood of committing
false-negative classification errors (because a larger proportion of the
students will pass). Raising the cutoff score reduces the likelihood of
false-positive classification errors (because a larger proportion of the
students will fail).
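
The opposing movement of the two error types can be shown with a small
sketch. It assumes that each examinee's "true" mastery status is known,
which, as noted above, is rarely the case in practice; the records and
cutoff values are hypothetical.

```python
from typing import List, Tuple

# Each record is (test score, whether the examinee is truly a master).
Record = Tuple[int, bool]


def classification_errors(records: List[Record], cutoff: int) -> Tuple[int, int]:
    """Return (false_negatives, false_positives) for a given cutoff.

    An examinee passes when score >= cutoff.
    False negative: a true master who fails.
    False positive: a true nonmaster who passes.
    """
    false_neg = sum(1 for score, is_master in records
                    if is_master and score < cutoff)
    false_pos = sum(1 for score, is_master in records
                    if not is_master and score >= cutoff)
    return false_neg, false_pos


# Hypothetical data: four nonmasters and five masters.
records = [(12, False), (14, False), (15, True), (16, False), (17, True),
           (18, False), (19, True), (21, True), (23, True)]

for cutoff in (14, 17, 20):
    false_neg, false_pos = classification_errors(records, cutoff)
    print(f"cutoff {cutoff}: {false_neg} false negatives, {false_pos} false positives")
# cutoff 14: 0 false negatives, 3 false positives
# cutoff 17: 1 false negatives, 1 false positives
# cutoff 20: 3 false negatives, 0 false positives
```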

Therein lies the problem. What is the optimal point at which the
standard should be set so that an appropriate compromise is made between
the two types of errors? Should the trade-off between false-positive and
false-negative errors be decided in favor of one or the other; or should
one favor neither, and instead seek an evenly balanced compromise? Asking
how serious each type of error is when placed in the context of the
purpose of the test or the use of test results is one way of answering
these questions.
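
One way to make the question of seriousness concrete is to attach a
relative cost to each type of error and look for the cutoff with the
smallest total cost. The sketch below is only an illustration of that
idea: the cost weights, the data, and the assumption that true mastery
status is known are all hypothetical, and the chapter does not prescribe
this calculation.

```python
from typing import Iterable, List, Tuple

Record = Tuple[int, bool]  # (test score, whether truly a master)


def weighted_loss(records: List[Record], cutoff: int,
                  cost_false_neg: float, cost_false_pos: float) -> float:
    """Total cost of misclassification at a cutoff (pass when score >= cutoff)."""
    false_neg = sum(1 for score, is_master in records
                    if is_master and score < cutoff)
    false_pos = sum(1 for score, is_master in records
                    if not is_master and score >= cutoff)
    return cost_false_neg * false_neg + cost_false_pos * false_pos


def best_cutoff(records: List[Record], candidate_cutoffs: Iterable[int],
                cost_false_neg: float, cost_false_pos: float) -> int:
    """Candidate cutoff with the smallest weighted loss.

    min() returns the first candidate encountered when several tie.
    """
    return min(candidate_cutoffs,
               key=lambda c: weighted_loss(records, c,
                                           cost_false_neg, cost_false_pos))


records = [(12, False), (14, False), (15, True), (16, False), (17, True),
           (18, False), (19, True), (21, True), (23, True)]

# Failing a true master judged three times as serious as passing a nonmaster.
print(best_cutoff(records, range(12, 25), cost_false_neg=3, cost_false_pos=1))  # 15
# Passing a nonmaster judged three times as serious as failing a master.
print(best_cutoff(records, range(12, 25), cost_false_neg=1, cost_false_pos=3))  # 19
```

With these hypothetical data the preferred cutoff moves down when false
negatives are weighted more heavily and up when false positives are.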


In practice, then, the question can be answered in terms of the
probable effects of committing the two different types of errors. It is
clear that to generate decisions of either the false-positive or
false-negative type could have serious implications for individual
students, for teachers and administrators, for policy-makers, and for
testing programs as a whole.

The implications of false-negative decisions include: (1) the
psychological and social burden to be borne by students who are
incorrectly classified as nonmasters; (2) the culpability (either ethical
or legal) of the decision maker in such an instance; (3) the costs (both
in dollars and human resources) of providing remediation to students who
do not, in fact, require it; (4) the costs of retaining students in grade
in larger proportions than had previously been expected or encountered;
and (5) loss of confidence in the validity of the test instruments and in
the decisions emerging from their administration.

The implications of false-positive decisions include: (1) unfounded
aspirations for success in competency-based endeavors on the part of
students wrongly classified as masters, (2) unfounded expectations on the
part of potential employers about the skill levels of such students as
potential employees, and (3) the perception of the lay public that
graduates lack sufficient command of skills (leading to a loss of
confidence in the value of the high school diploma).

The purpose of the tests or testing program and the use made of the
results will help to determine the seriousness of one type of error or the
other.

The ability of a particular method to classify examinees correctly is
a prerequisite for selecting that standard setting approach. It is,
however, one that should be viewed in terms of the degree or extent to
which correct classification is maximized. There is no procedure known
that will correctly classify 100% of the examinees. Many cutoff score
models approach this issue in different ways. In some methods, such as
decision-theoretic approaches (Hambleton & Novick, 1973), an explicit
emphasis is placed on controlling the amount of misclassification. In
models which involve judgments on test questions (Nedelsky, 1954), the
control is implicit in correct application of the model.

Airasian et al. (1978) also raise the issue of classification accuracy
as it relates to public acceptability. Suppose that a method of setting a
cutoff score maximizes correct classification, but has the result that an
overwhelming percentage of the examinees fail the test? Such a method
risks failure not because of its statistical weakness, but because to pass
so few students will have enormous and far-reaching educational,
psychological, and financial consequences.



Ease of Implementation

The establishment of cutoff scores generally requires not only
statistical sophistication, but also an awareness of certain political,
educational, and financial concerns. In either selecting a cutoff score
model which has been used previously or creating an approach tailored to
the needs of a particular situation, therefore, program planners may want
to consider the following factors:

— time and technical expertise: what is available versus
what is needed;

— reproducibility of procedures.

The issue of public acceptance will be discussed separately.



Time and technical expertise. The developmental phases of a minimum
competency program can be quite lengthy, particularly when they involve
determination, definition, and resolution of complex political and
theoretical issues. While exercising care in the determination of a cutoff
score is generally desirable, the task need not be so time-consuming that
it hinders the completion of the project. Procedures that require input
from large numbers of people for a substantial amount of time are
cumbersome to implement and very costly in terms of professional time
(e.g., Jaeger, 1978). Moreover, there are some procedures which require
judges to make several ratings or judgments. In such a process, the judges
can become confused or the scores collected can be unreliable because of
the complexity of the required task.



Reproducibility of procedures. A feature of some competency tests is
that they measure skills in the context of "real life" situations. To the
extent that this is true, the timeliness and appropriateness of test items
may be of concern and may therefore need to be reviewed periodically.
Even in programs in which the assessment instrument is not written in life
role terms, the test items measure objectives, the importance of which may
vary over time. Whenever the objectives and/or items change, the cutoff
score or standard may be affected.

In some testing programs, a desire for test security has entailed the
generation of multiple forms of a test. In these cases and others in which
new tests are developed and introduced frequently, it may be necessary to
recalculate or reapply a cutoff score method to a new set of test
questions. In these cases, a standard setting procedure that is not unduly
complex or expensive, but still sound, has been found to be most useful.






Public Acceptance



On the basis of evidence from the program in Kanawha County, West
Virginia, Candor-Chandler (1978) states that a primary consideration in
the implementation of a minimum competency testing program should be
whether the model can be easily understood by the community. Public
acceptance is likely to be facilitated if the approaches taken in
developing and implementing the program are understandable and acceptable
to teachers, administrators, and the wider community. As a key component
of the program, standard setting may merit particular consideration with
respect to the issue of public acceptance.

One notable outcome of a competency program is the number or percent
of examinees who pass or fail. According to the example from Airasian et
al. (1978) cited earlier, although a standard setting approach may
statistically maximize the correct classification of examinees, it may
fail more students than the constituencies of a program find acceptable.
Standard setters often take this factor into account. They may estimate
the minimum percentage of examinee failure that the public will tolerate;
or they may set a cutoff score to achieve a specific percent of passes and
fails.
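
The last practice mentioned, choosing a cutoff to yield a specific
percent of passes, amounts to reading a cutoff off the observed score
distribution. The sketch below uses hypothetical scores and a
hypothetical target rate; it illustrates the arithmetic only and takes no
account of test content, which is the limitation of setting standards by
passing rate alone.

```python
from typing import List


def cutoff_for_pass_rate(scores: List[int], target_pass_rate: float) -> int:
    """Highest observed score usable as a cutoff such that at least
    target_pass_rate of examinees pass (pass when score >= cutoff)."""
    if not scores or not 0.0 < target_pass_rate <= 1.0:
        raise ValueError("need scores and a target rate in (0, 1]")
    n = len(scores)
    # Try the highest candidate cutoff first and relax it step by step.
    for cutoff in sorted(set(scores), reverse=True):
        passing = sum(1 for s in scores if s >= cutoff)
        if passing / n >= target_pass_rate:
            return cutoff
    return min(scores)  # not reached for valid input: the lowest score passes everyone


# Hypothetical field-test scores for ten examinees.
scores = [8, 11, 12, 14, 15, 15, 17, 18, 19, 20]

print(cutoff_for_pass_rate(scores, 0.80))  # 12: eight of ten score 12 or higher
print(cutoff_for_pass_rate(scores, 0.50))  # 15: six of ten score 15 or higher
```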

Miller (1978), in reporting on national conferences on minimum
competency testing, states that it is the process for selecting the cutoff
score which is the key to its acceptability. Both community
representatives and experts in the field can contribute important
information to the process. Furthermore, it is recommended that the
standard setting process not be viewed as a single or isolated task, but
rather as one that should be reviewed from time to time by different
judges, revised on the basis of field data, and reconsidered in the light
of possible changes in the goals or emphasis of a particular minimum
competency program (Fremer, 1977; Miller, 1978; Shepard, 1979).



Standard Setting Strategies 



In the following discussion of standard setting models currently in
use in the field, highly empirical models will be excluded. It has been
found that highly statistical models are not feasible in actual practice
because they require conditions which cannot be met. For example, Millman
(1973) has defined a model for use only with individual students. The
Emrick model (1971) requires homogeneity, an equal level of item
difficulty, and equal item intercorrelations. Other Bayesian approaches
require collateral and prior information (Hambleton & Novick, 1973) which
is often difficult to obtain. All of these empirical models also
incorporate judgments (Block, 1972; Millman, 1973).

The models that will be discussed are:

(1) administrative decision or consensus;

(2) Nedelsky;

(3) Jaeger;

(4) contrasting groups.



With the exception of administrative decision or consensus, the
methods above can be classified as requiring either (1) judgments on items
or (2) judgments on examinees. This distinction is also used in part by
Hambleton and Eignor (1978) and Zieky and Livingston (1977), and the reader
is referred to these works for additional discussion.



Administrative Decision or Consensus 



Neither the administrative decision nor the consensus method of
setting cutoff scores can be classified on the dimensions of judgment or
of statistical assumptions, because there is very little structure or
dimensionality to analyze in either approach. They are included in this
discussion because, for a variety of reasons, they are the methods which
are most commonly employed.

Setting standards by administrative decision means simply that the
cutoff score is determined by one or more persons holding a position of
authority or responsibility in a testing program. Although these judges
may be capable of making an extremely informed decision, it may not be a
decision which is open to external verification of its appropriateness.
As a result, a disproportionate number of students may pass or fail the
test because, in setting the standard, there was no accommodation for the
pass/fail rate.

The second and very similar method for establishing a cutoff score is
by consensus. The procedure for setting the cutoff score may again be
largely undefined, but the judges in this method are usually members of a
group which is large enough to minimize the outlook of any one individual.
Also, such a group usually consists of educators representing various
educational constituencies, so that a complete array of educational
beliefs is brought to the issue of setting a passing score (Wilson, 1976).

Standard setting by administrative decision or by consensus is popular
for a great many reasons. As a first effort toward standard setting, these
approaches are easy for all of the participants in a program to understand.
They are not time-consuming or costly methods, and require no additional
technical expertise. What these two procedures may lack in statistical
strength they compensate for in other areas. For example, they accommodate
certain issues better than many other models. Financial, political, and
public concerns weigh very heavily and are usually carefully considered in
these standard setting processes. The judges involved are often acutely
aware of the importance of these issues.

It should be noted that one aspect of the consensus method, that of
group decisions or recommendations by expert judges, is a major component
of many of the procedures which will be described below. Each of the
other procedures, however, includes structured review requirements and/or
empirical information.

Setting standards by administrative decision or consensus may also
involve considering the specific competencies to be assessed. In Vermont,
for example, administrators prepared a list of competencies in five areas
following statewide reviews and for each competency set an individual
standard. Standards below 100% were set only for those competencies on
which a student might make errors due to carelessness rather than lack of
mastery. Such competencies are those measuring processes (e.g., writing
names of arabic numerals) rather than a student's command of facts. Where
processes are being assessed, the Department defines 80% as meaning that
the pupil must answer correctly at least 80% of the examples.

Administrators responsible for setting standards may also consider
using field-test data in arriving at a decision. In Maryland, for
example, Project Basic staff reviewed the results of a field test of four
reading competencies before setting a passing standard of 80% on each
competency for the secondary-level Functional Reading Test. For a more
complete listing of state and local programs using field-test data and
specific standard-setting procedures, see A Study of Minimum Competency
Testing Programs: Final Summary and Analysis Report (National Evaluation
Systems, 1979).






Judgments on Items 

The methods to be described here are the Nedelsky approach and the
model proposed by Jaeger. These are methods which require specialists to
examine a test or its items and to decide on the score which a person with
minimum competency should attain.



Nedelsky 



One of the most popular approaches for setting standards for minimum
competency programs is one that was originally developed for use on
examinations in medicine. The Nedelsky approach is flexible enough for use
on any number of test items, i.e., a test of any length. The ratings can be
completed with or without normative data. The number of judges or raters
can vary. Nedelsky's approach can be used only on multiple-choice items,
for which there is a single correct response.

Glass (1978) has outlined the Nedelsky procedure as follows:

Directions to Instructors

Before the test is given, the instructors in the course are
given copies of the test, and the following directions:

In each item of the test, cross out those responses which the
lowest D-student should be able to reject as incorrect. To the left
of the item, write the reciprocal of the number of the remaining
responses. Thus if you cross out one out of five responses, write
1/4.

Example. (The example should preferably be one of the items of the
test in question.)






1/4  Light has wave characteristics. Which of the following is the
     best experimental evidence for this statement?

     A  Light can be reflected by a mirror.
     B  Light forms dark and light bands on passing through a small
        opening.
     C  A beam of white light can be broken into its component
        colors by a prism.
     D  Light carries energy.
     E  Light operates a photoelectric cell. (crossed out)



Preliminary Agreement on Standards

After the instructors have marked some five or six items
following the directions above, it is recommended that they hold
a brief conference to compare and discuss the standards they have
used. It may also be well that at this time they agree on a
tentative value of constant k (see section on The Minimum Passing
Score). After such a conference the instructors should proceed
independently.



Terminology

In describing the method of computing the score corresponding
to the lowest D the following terminology is convenient:

a. Responses which the lowest D-student should be able to
reject as incorrect, and which therefore should be primarily
attractive to F-students, are called F-responses. In the example above,
response E was the only F-response in the opinion of the instructor
who marked the item.






b. Students who possess just enough knowledge to reject
F-responses and must choose among the remaining responses at random
are called F-D students, to suggest borderline knowledge between
F and D.

c. The most probable mean score of the F-D students on a
test is called the F-D guess score and is denoted by $M_{FD}$. As will
be shown later, $M_{FD}$ is equal to the sum of the reciprocals of the
numbers of responses other than F-responses. (In the example
above, the reciprocal is 1/4.)

d. The most probable value of the standard deviation
corresponding to $M_{FD}$ is denoted by $\sigma_{FD}$.

It should be clear that "F-D students" is a statistical
abstraction. The student who can reject the F-responses for every
item of a test and yet will choose at random among the rest of the
responses probably does not exist; rather, scores equal to $M_{FD}$ will
be obtained by students whose patterns of responses vary widely.
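
As an illustration outside the quoted directions, the F-D guess score for
one instructor's markings can be computed as sketched below. The item
counts are hypothetical. The standard deviation shown assumes that the
F-D student guesses independently on each item, so that the variance is
the sum of p(1 - p) over the items, where p is the reciprocal written
beside each item; that formula is an assumption made here and is not
derived in the passage quoted above.

```python
import math
from fractions import Fraction
from typing import List


def fd_guess_score(remaining_counts: List[int]) -> Fraction:
    """Sum of the reciprocals of the numbers of responses left on each
    item after the F-responses have been crossed out."""
    return sum((Fraction(1, r) for r in remaining_counts), Fraction(0))


def fd_standard_deviation(remaining_counts: List[int]) -> float:
    """Standard deviation of the guess score, assuming independent
    random guessing among the remaining responses (an assumption)."""
    variance = sum((Fraction(1, r) * (1 - Fraction(1, r))
                    for r in remaining_counts), Fraction(0))
    return math.sqrt(variance)


# Hypothetical six-item test: responses remaining on each item after
# one instructor has crossed out the F-responses.
remaining = [4, 3, 2, 5, 2, 4]

m_fd = fd_guess_score(remaining)
print(m_fd, float(m_fd))                 # 61/30, about 2.03 items
print(fd_standard_deviation(remaining))  # assumed sigma for this instructor
```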



The Minimum Passing Score 



The score corresponding to the lowest D is set equal to
$\bar{M}_{FD} + k\sigma_{FD}$, where $\bar{M}_{FD}$ is the mean of the
$M_{FD}$ obtained by various instructors, and k is a constant whose
value is determined by several considerations. The F-D students are
characterized not so much by the positive knowledge they possess as by
being able to avoid certain misjudgments. Most instructors who have
used the F-D guess score technique have felt that this "absence of
ignorance" standard is a mild one, and that therefore the minimum
passing score should be such as to fail the majority of F-D students.
Assigning to k the values -1, 0, 1, and 2 will (on the average) fail
respectively 16 percent, 50 percent, 84 percent, and 98 percent of the
F-D students. An informed final decision on the value of k can be reached






after the instructors have chosen the F-responses, for at that time
they are in a better position to estimate the rigor of the standards
they have been using. In keeping within the spirit of absolute
standards, however, the value of k should be agreed on before the values
of $M_{FD}$ are computed and certainly before the students' scores are
known.

It is the essence of the proposed technique that the standard
of achievement is arrived at by a detailed consideration of individual
items of the test. Only minor adjustments should be effected by
varying the value of k. The reason for introducing constant k, with
the attendant flexibility and ambiguity, is that F-responses in most
examinations vary between two extremes: the very wrong, the choice
of which indicates gross ignorance, and the moderately wrong, the
rejection of which indicates passing knowledge. If a particular
test has predominantly the first kind of F-responses, this
peculiarity of the test can be corrected for by giving k a high value.
Similarly, a low value of k will correct for the predominance of
the second kind of F-responses. It is expected that in the majority
of cases a change of not more than ± .5 in the tentative value of k
agreed upon during the preliminary conference should introduce the
necessary correction. It would be difficult to find a theoretical
justification for values of k as high as two; for most tests the
value k = 0 is probably too low. This suggests a rather narrow
working range of values, say between .5 and 1.5, with the value
k = 1 as a good starting point.

If a part A of a given test consists of $N_A$ items, each of which
has $s_A$ non-F-responses (one of these being the right response), the
F-D guess score for each item, i.e., the probability that an F-D
student will get the right answer in any one item, is $P_A = 1/s_A$.
The most probable values of the mean and the square of the standard
deviation on this part of the test are given by $M_A = P_A N_A$ and
$\sigma_A^2 = P_A(1 - P_A)N_A$, and $\sigma_{FD}^2 = \sum_A \sigma_A^2$.

$M_{FD}$ must be accurately computed for each test. $\sigma_{FD}$,
however, may be given an approximate value. In a test of five-response
items s may vary from one to five. If these five values are equally
frequent, $\sigma_{FD} = .41\sqrt{N}$. If, on the other hand, the extreme
values, s = 1 and s = 5, are less frequent than the other three values,
as seems likely to be true for most tests,
$.41\sqrt{N} < \sigma_{FD} < .50\sqrt{N}$. Since $k\sigma_{FD}$ is usually
much smaller than $M_{FD}$, approximations are in order.

With k = 1 and $\sigma_{FD} = .45\sqrt{N}$, the equation, Minimum Passing
Score = $\bar{M}_{FD} + .45\sqrt{N}$, should work out fairly well in the
majority of cases and is therefore recommended as a starting point in
experimenting with the proposed technique (Glass, 1978, pp. 22-24).
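
The computation described above can be sketched in a few lines of
Python. This is an illustrative sketch only, not part of Nedelsky's
text: the function names and the input layout (one list of s values
per instructor, one s per item) are assumptions made for the example.
The default value of sigma follows the .45 times the square root of N
approximation quoted above; an exact value can be passed in instead.

```python
import math


def fd_guess_score(non_f_counts):
    """Return (M_FD, sigma_FD) for one instructor's markings.

    non_f_counts[i] is s for item i: the number of responses left once
    the instructor's F-responses are removed (right answer included).
    """
    probs = [1.0 / s for s in non_f_counts]        # P = 1/s for each item
    m_fd = sum(probs)                              # sum of the reciprocals
    sigma_fd = math.sqrt(sum(p * (1.0 - p) for p in probs))
    return m_fd, sigma_fd


def minimum_passing_score(counts_by_instructor, k=1.0, sigma=None):
    """Mean M_FD over instructors plus k times sigma_FD.

    If sigma is not supplied, the approximation .45 * sqrt(N) is used.
    """
    n_items = len(counts_by_instructor[0])
    m_values = [fd_guess_score(counts)[0] for counts in counts_by_instructor]
    mean_m_fd = sum(m_values) / len(m_values)
    if sigma is None:
        sigma = 0.45 * math.sqrt(n_items)
    return mean_m_fd + k * sigma


if __name__ == "__main__":
    # Two hypothetical instructors marking a four-item test.
    markings = [[4, 2, 5, 3], [4, 2, 4, 3]]
    print(round(minimum_passing_score(markings, k=1.0), 2))
```

On a real test the list for each instructor would have one entry per
item, and k would be fixed before any student scores are seen, as the
text recommends.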



Adaptation/application. Since minimum competency testing programs
specify a standard not in terms of traditional D or F classroom scores,
Nedelsky's procedure has been adapted in a number of ways. Nedelsky's
procedure is also often coupled with the Angoff method, although the
latter is more typically used in setting standards for licensing
examinations. New Jersey, in its Minimum Basic Skills program, used both
methods, but modified Nedelsky's procedure in the following way:






1) The first step in applying the standard setting procedure is to
think about what you consider to be the lowest level of performance
you are still willing to classify as mastery of the skills measured
by the test that you worked on. If you have recent classroom
experience, it may help you to think about students you have known that
were just barely good enough to be considered masters of the basic
skills measured by the test.

We expect that there will be some differences of opinion as to what
is meant by minimally acceptable performance.

2) The second step is to look at the first question in the test
and decide how many wrong answers are so wrong that even the
minimally acceptable student would know that they are wrong.

For example, the following question is similar to one on the Grade
Three Math test:

The school lunchroom served 506 people on Monday and 315 people
on Tuesday. How many people were served on the two days?

(A) 191

(B) 201

(C) 811

(D) 821

You may decide that even the minimally competent student should
know that A and B are wrong because the total for two days would be
greater than the number on any single day. But you may decide that
wrong answer C involves an error that the minimally competent student
would not know is wrong. You would therefore decide that two wrong
answers for the question are so wrong that even the minimally
competent student would know that they are wrong.



3) We will then ask for a few volunteers to tell the groups 
which wrong answers were selected and their reasons for selecting 
them. You will be encouraged to discuss the choices. The discus- 
sions may either confirm your earlier opinions or change your mind. 






4) The last step is for you to record the number of wrong answers
you selected as being so wrong that even the minimally qualified
student would know they are wrong.

5) We will go on to the next question and repeat the process.
After you are done, we will estimate the tentative standard for
each test based on the data you provided.



The committees utilized the modified Nedelsky procedure and each
person developed an estimated proficiency standard for a particular
test. Next, a mean estimated standard was obtained. This mean was
the best estimate for the proficiency standard using the Nedelsky
procedure (Koffler, 1979, pp. 9-10).
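
As a worked illustration of the steps above, the following Python
sketch turns each judge's recorded counts into an estimated standard
and then averages across judges. The source does not spell out the
arithmetic used in New Jersey, so the conversion is an assumption: it
applies the usual Nedelsky rule that a minimally competent student
guesses among the answers left after the eliminated ones are removed.
The function names and the sample counts are hypothetical.

```python
from statistics import mean


def judge_standard(num_options, eliminated):
    """Estimated standard for one judge, in expected items correct.

    num_options[i] is the number of answer choices on item i.
    eliminated[i] is the number of wrong answers the judge recorded as
    "so wrong" that the minimally competent student would reject them.
    """
    total = 0.0
    for n, e in zip(num_options, eliminated):
        if not 0 <= e < n:
            raise ValueError("eliminated count must leave the right answer")
        total += 1.0 / (n - e)     # chance of a correct guess on this item
    return total


def committee_standard(num_options, eliminated_by_judge):
    """Mean of the judges' estimated standards."""
    return mean(judge_standard(num_options, e) for e in eliminated_by_judge)


if __name__ == "__main__":
    # The lunchroom item: four choices, A and B eliminated, so the
    # minimally competent student guesses between C and D.
    print(judge_standard([4], [2]))                      # 0.5
    # Two hypothetical judges rating a two-item test.
    print(committee_standard([4, 4], [[2, 1], [3, 2]]))
```

For the lunchroom question, eliminating two of the three wrong answers
leaves two choices, so the item contributes one half of a point to
that judge's estimated standard.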



Application. The Nedelsky model was applied by the Kanawha County
schools in West Virginia (Candor-Chandler, 1978). Although consistency
was found across groups of judges who completed the process three
different times, the researcher reports that the application was not
successful. Teachers were uncomfortable with the process of setting
standards and of determining minimum competency. In addition, it was



Candor-Chandler indicated that the cutoff scores for Kanawha County
were then set after a review of preliminary data and consideration of
certain educational/instructional factors.








The model proposed by Jaeger (1978):

(1) is technically straightforward, quite long, and maximizes
participation and involvement of educational constituencies;

(2) is an iterative process;

(3) involves normative data in part of the review.

It should be noted that this model, unlike some others, defines minimum
competency without using that term in the body of the definition, and so
avoids circularity.

Jaeger proposed this method for standard setting for the North
Carolina high school competency test. To accomplish this task, 700 persons
(registered voters, teachers, counselors, and administrators) convened in
groups of 50 to proceed through the standard setting model.

Judges were first required to take the exam which they would later
rate. For each item judges were asked one of the following two questions:

(1) Should every high school graduate be able to answer this item
correctly?

(2) If a student does not answer this item correctly, should s/he
be denied a high school diploma?

Judges next received the results of the above survey questions as well
as actual performance data. With this information, judges were asked to
review and revise their initial judgments as they might consider necessary.

Jaeger's procedure calls for recalculation of the judges' ratings,
redistribution of the new ratings, and another judgment. Judges then
received information on the proportion of students who would have passed
or failed, as determined on the basis of the recommended cutoff scores.
With this information, judges were asked to make a final statement on the
necessity for each item on the test.






Median scores were calculated by group (type or constituency), and
the passing score was then set at the minimum median score calculated for
a group.
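
The final step can be illustrated with a short Python sketch. It
assumes, for the sake of the example, that each judge's final
judgments have already been reduced to a single recommended passing
score (for instance, the number of items the judge marked as
necessary); the group names and scores shown are hypothetical.

```python
from statistics import median


def jaeger_passing_score(recommended_by_group):
    """Return (passing score, medians by group).

    recommended_by_group maps a constituency name to the list of
    passing scores recommended by the judges in that group.
    """
    medians = {group: median(scores)
               for group, scores in recommended_by_group.items()}
    # The passing score is the lowest of the group medians.
    return min(medians.values()), medians


if __name__ == "__main__":
    final_judgments = {
        "teachers": [30, 32, 35],
        "registered voters": [28, 31, 29, 33],
        "counselors": [34, 30, 31],
    }
    score, medians = jaeger_passing_score(final_judgments)
    print(medians)
    print(score)
```

Taking the minimum of the group medians means that no constituency's
typical judge would regard the adopted standard as higher than the one
that group recommended.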


Adaptation/application. The Gallagher report to the North Carolina
Board of Education (1978) stated that there was a delay in setting
standards until the completion of four studies, designed to provide
additional decision-making information. The studies consisted of:

(1) a comparison of competency test results with norm-referenced
test results;

(2) identification of the minimally competent and incompetent
student;

(3) teacher judgment of the tests;

(4) a statistical study of the spring (1978) trial distributions.



In support of (1), scores from the SHARP Reading and TOPICS
Mathematics were compared to the California Achievement Test. Both the
total score and the separate reading and math scores were reviewed for the
total group tested and for subgroups classified by sex and race. "All of
the results support the need to place (these) raw scores or percentage
scores into some more standardized set of measures that would allow one to
make some legitimate comparisons across subject areas" (North Carolina,
SDE, 1978, p. 11).

For the second study, schools in a sample group were asked to identify
students whom they considered marginally competent and students considered
noncompetent. The performances of these students on the various tests led
the author to stress the need for differentiated cutoff scores in different
subject areas.

The procedure used for the third study is very similar to that
proposed in Jaeger (1978). Specifically, teachers and other curriculum
specialists participated in a one-day conference for the purpose of giving
judgments as to a minimum passing score for North Carolina on the SHARP
and TOPICS tests. The judges' tasks were to:

(1) take the test and try to see the test through the eyes of a
competent (not superior) student;








(2) judge the percent of correct answers that should be required
as passing scores for the reading and mathematics tests;

(3) review and revise their original judgments as necessary, when
given student trial performance data (it is interesting to note
that the math standard was reduced, while the reading standard
was relatively unchanged as a result of this step);

(4) review and revise the second judgment made, if necessary, when
given the group results on the recommended standard.



Gallagher notes that the ratings which teachers made for the math test
changed with the increased information provided to them at each step. The
teachers believed that the information provided assisted them in making
informed judgments.

The fourth study was a focused statistical analysis of the number and
placement of items students omitted from their responses. Time and/or
motivation seemed to be relevant factors in accounting for the increased
number of items omitted in the last part of the test.

With all of the information from the four studies, the North Carolina
Competency Test Commission met and established the standards for the
reading and math tests.



Judgments on Examinees 



Two methods for setting cutoff scores proposed by Zieky and Livingston
(1977) respond directly to many concerns encountered in minimum competency
assessment. These methods, called the "borderline groups" and "contrasting
groups" methods, require judges to make judgments on examinees, and not on
the test or its items.






Contrasting Groups 

As the name implies, the contrasting groups method involves
examination of scores of students classified in discrete groups: those
considered to be masters of the material measured by the test (for which
the standard is to be set) and those considered to be nonmasters.

Judges who are familiar with each student's current capabilities
in the content of the test are asked to identify those students who are
clearly masters and those who are clearly nonmasters. According to Zieky
and Livingston (1977), a minimum of 100 classified students is needed to
achieve a stable estimate of the standard.

Following the test administration, the score distributions of the
students in these two distinct groups are superimposed on each other.
An initial standard for the test is the intersection point of the two
graphs. An advantage of this method is that the cutoff score can be
adjusted (raised or lowered) to minimize a selected error of
classification. The following table illustrates this method.*







* From Manual for Setting Standards on the Basic Skills Assessment
Tests, by M. Zieky and S. Livingston. Princeton, New Jersey:
Educational Testing Service, 1977.

In this method, the graphic representation of score distributions
facilitates the consideration of errors of misclassification. While it is
possible in other models to recalculate percents of students passing or
failing by adjustments to the standards, some researchers may prefer the
visual presentation, an integral part of the contrasting groups approach.
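
Where a numerical counterpart to the graph is wanted, the search for a
cutoff can be sketched in Python as follows. This is not the procedure
given by Zieky and Livingston: in place of reading the intersection
point from the superimposed distributions, the sketch simply tries
every whole-number cutoff and keeps the one with the fewest weighted
misclassifications. The function name, the weights, and the sample
scores are assumptions made for the example.

```python
def contrasting_groups_cutoff(master_scores, nonmaster_scores,
                              w_false_fail=1.0, w_false_pass=1.0):
    """Return the lowest whole-number cutoff with minimum weighted error.

    A student passes when score >= cutoff. A false fail is a master
    scoring below the cutoff; a false pass is a nonmaster at or above it.
    """
    all_scores = list(master_scores) + list(nonmaster_scores)
    best_cutoff, best_cost = None, None
    for cutoff in range(min(all_scores), max(all_scores) + 2):
        false_fail = sum(1 for s in master_scores if s < cutoff)
        false_pass = sum(1 for s in nonmaster_scores if s >= cutoff)
        cost = w_false_fail * false_fail + w_false_pass * false_pass
        if best_cost is None or cost < best_cost:
            best_cutoff, best_cost = cutoff, cost
    return best_cutoff


if __name__ == "__main__":
    masters = [22, 25, 27, 28, 30, 31]        # hypothetical scores
    nonmasters = [12, 15, 18, 20, 23, 24]
    print(contrasting_groups_cutoff(masters, nonmasters))
    # Weighting false fails more heavily lowers the cutoff.
    print(contrasting_groups_cutoff(masters, nonmasters, w_false_fail=3.0))
```

Raising one weight relative to the other corresponds to adjusting the
cutoff to minimize a selected error of classification, as described
above.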

Application. A procedure used by Fillbrandt and Merz (1977) to set
standards for a California school district is similar in concept to the
contrasting groups approach of Zieky and Livingston (1977) and the optimal
cutting score method of Berk (1976). The researchers determined that to
distinguish between students who are competent and noncompetent, they
would test and establish standards on the basis of the performance of
"successful" persons in the community. Fillbrandt and Merz used matrix
(test item and examinee) sampling to minimize the test-taking time of
participants selected as meeting the criteria specified for "successfully
employed persons."

Standards were set on the basis of the empirical results of the test.
For example:



Score Distributions Derived from
Multiple Matrix Sampling

Parameters            Reading Test    Math Test

Mean                      25.63         29.88
Standard Deviation         4.63          9.80
Median                    27.14         31.00
Q                          2.09          7.27
90th %ile                 30.00         42.00
75th %ile                 29.00         38.00
50th %ile                 27.00         31.00
25th %ile                 23.00         23.00
10th %ile                 19.00         16.00
Reliability                .854          .916



A cutting score of 20 was established for the reading test. This
decision was based on plots of the distribution which indicated that an
asymptote was reached near the scores of 19 and 20; it appeared that below
the score of 19 the curve flattened, indicating that the percentages of
those scoring at each point below 20 were about equal. In addition, the
score of 20 represents 66.6% correct and identifies the upper 90% of scores
(Fillbrandt & Merz, 1977).






Two other programs that have used this standard-setting procedure are
Kentucky and Peterborough, New Hampshire. In Kentucky, the Department of
Education asked a representative sample of teachers to classify their
students into three groups: those who do need remediation, those who do
not, and those who may need remediation in the specific competencies. The
students then took the screening test on which a standard was to be set.
The standard chosen was the point of intersection between the scores of
students who do need remediation and those who may need it. In
Peterborough, New Hampshire, a standard for each competency in
communication and computation was set by comparing the scores of students
two grade levels ahead and two behind the grade level at which mastery of
the competency is expected.

The success of this procedure has been attributed to the involvement
of the community and the definition of standards in terms of functional
competencies actually needed in the job market; in addition, the complexity
and technical detail of the study furnish very strong evidence for its
acceptability.

Wilson (1976) describes the use of an external criterion group, such
as the one utilized here, as a better approach to standard setting than
administrative decision or consensus. He also acknowledges that to use
such a group is more expensive and more difficult in terms of the
technical expertise and logistics which are necessary.

Program planners may also wish to consider a companion procedure to
the contrasting groups method known as the borderline groups method. In
this method at least 100 students whose performance cannot be clearly
classified as adequate or inadequate are tested. The median of the scores
of this group is computed and used as an estimate of the cutoff score.



********** 



Whichever method is used, the ease of implementation is enhanced by
the use of procedures that are simple yet sound, and neither costly nor
time-consuming; both are based on the judgments of teachers who are
extremely knowledgeable about student capabilities. This last factor can
present a problem if teachers are not carefully selected and trained and
if their judgments are not accurate with respect to the classification of
specific students. These two approaches also rely on a definition of
minimum competency relative to the content being tested, and not one
directly related to the test. This definition of minimum competency must
be applied to classify students into groups for the statistical analyses
required by the models. It is therefore critical to the accuracy of the
tests.






What Is Actually Being Done



This chapter has cited examples of the application or modified
application of each of the standard setting procedures selected for
discussion. These examples have been drawn from descriptions of both state
and local minimum competency programs. In addition, the following table
provides information about the total number of programs which employ each
of the various procedures to set their standards.



Procedures Used in Setting Standards*

Procedure                          State    Local

Administrative Decision              5        6
Contrasting Groups                   2        3
Nedelsky/Angoff                      1        2
Field Test Results and/or
  Other Statistical Procedures       9        7
Competency Definition                3        2

* From National Evaluation Systems, 1979. The reader is
referred to this report for additional information.

In the table above, the procedure labeled Competency Definition is
a process in which the standard is established as part of the competency
definition. The procedure or method for this is not specified.

Similarly, several states specify standards using field test data and/or
statistical techniques. This in itself is unlikely to be the procedure,
but only a material adjunct to a process such as administrative decision,
consensus, or Nedelsky. In addition, there are very few statistical
techniques that generate a standard. Again, most represent a component of
the process. Further information or details which would tie the techniques
to a procedure were not available.




References 



Airasian, P., Pedulla, J., & Madaus, G. Policy issues in minimal
competency testing and a comparison of implementation models. Boston:
Heuristics, 1978.

Angoff, W. H. Scales, norms, and equivalent scores. In R. L. Thorndike
(Ed.), Educational measurement (2nd ed.). Washington, D.C.: American
Council on Education, 1971.

Armstead v. Starkville Municipal Separate School District, 325 F. Supp.
560 (N.D. Miss. 1971), modified, 461 F.2d 276 (1972).

Berk, R. A. Determination of optimal cutting scores in
criterion-referenced measurement. Journal of Experimental Education,
1976, 45, 4-9.

Block, J. H. Student learning and the setting of mastery performance
standards. Educational Horizons, 1972, 50, 183-190.

Brickell, H. M. Seven key notes on competency testing. In B. S. Miller
(Ed.), Minimum competency testing: A report of four regional
conferences. St. Louis, Missouri: CEMREL, 1978.

Candor-Chandler, C. Competency measurement at the local level: A case
study of Kanawha County Schools, West Virginia. In R. B. Ingle,
M. R. Carroll, & W. J. Gephart (Eds.), The assessment of student
competence in the public schools. Bloomington, Indiana: Phi Delta
Kappa, 1978.

Conaway, L. E. Setting standards in competency-based education: Some
current practices and concerns. In M. A. Bunda & J. Sanders (Eds.),
Practices and problems in competency-based measurement. NCME, 1979.

Dent v. West Virginia, 129 U.S. 114 (1889).

Ebel, R. L. Essentials of educational measurement. Englewood Cliffs,
New Jersey: Prentice-Hall, 1972.

Fillbrandt, J. R., & Merz, W. R. The assessment of competency in
reading and mathematics using community-based standards. Educational
Research Quarterly, 1977, 2(1).






Fremer, J. Setting and evaluating competency standards for awarding
high school diplomas. Paper presented at the meeting of the
National Council on Measurement in Education, New York, April
1977.

Georgia Association of Educators v. Nix, 407 F. Supp. 1102 (1976).

Glass, G. Standards and criteria. Journal of Educational Measurement,
1978, 15, 237-261.

Hambleton, R. K., & Eignor, D. R. Competency test development,
validation, and standard-setting (Research Rep. No. 84, Laboratory of
Psychometric and Evaluative Research). Amherst, Massachusetts:
University of Massachusetts, School of Education, 1978.

Hambleton, R. K., & Novick, M. R. Toward an integration of theory and
method for criterion-referenced tests. Journal of Educational
Measurement, 1973, 10, 159-170.

Jaeger, R. M. A proposal for setting a standard on the North Carolina
High School Competency Test. Paper presented at the meeting of the
North Carolina Association for Research in Education, Chapel Hill,
North Carolina, 1978.

Jaeger, R. M. Measurement consequences of selected standard-setting
models. In M. A. Bunda & J. Sanders (Eds.), Practices and problems
in competency-based measurement. NCME, 1979.

Klein, S. B., & Kosecoff, J. Issues and procedures in the development
of criterion-referenced tests. Princeton, New Jersey, 1973. (ERIC
Document Reproduction Service No. TM26)

Koffler, S. L. Setting proficiency standards: A comparative approach.
Paper presented at the annual meeting of the American Educational
Research Association, San Francisco, April 1979.

Miller, B. S. (Ed.) Minimum competency testing: A report of four
regional conferences. St. Louis, Missouri: CEMREL, 1978.

Millman, J. Passing scores and test lengths for domain-referenced
measures. Review of Educational Research, 1973, 43(2), 205-216.

Millman, J. Criterion-referenced measurement. In W. J. Popham (Ed.),
Evaluation in education: Current applications. Berkeley, California:
McCutchan Publishing, 1974.






Nassif, P. M. Standard-setting for criterion-referenced teacher licensing
tests. Paper presented at the annual meeting of the National Council
on Measurement in Education, Toronto, 1978.

Nedelsky, L. Absolute grading standards for objective tests. Educational
and Psychological Measurement, 1954, 14, 3-19.

North Carolina, State Department of Education. Setting minimum competency
standards (Report of the North Carolina Competency Test Commission,
J. J. Gallagher, Chairman). September 1978.

Shepard, L. A. Setting standards. In M. A. Bunda & J. Sanders (Eds.),
Practices and problems in competency-based measurement. NCME, 1979.

United States v. North Carolina, 400 F. Supp. 343 (E.D.N.C. 1975), 425
F. Supp. 789 (E.D.N.C. 1977).

United States v. South Carolina, 15 FEP Cases 1196 (D.C.S.C. 1977).

Wilson, H. A. Two sides to tests: Positive, negative. National
Assessment, 1976.

Zieky, M. Steps to follow and questions to consider when setting
standards. Handout prepared for symposium on Critical issues in setting
minimum competency standards. Spring 1979.

Zieky, M., & Livingston, S. Manual for setting standards on the Basic
Skills Assessment Tests. Princeton, New Jersey: Educational
Testing Service, 1977.






CHAPTER 5 

INTEGRATING TESTING WITH INSTRUCTION 
Mary F. Tobin



Introduction 



The rise of minimum competency testing has spurred renewed interest
in curricular and instructional issues, ranging from speculations about
the impact of such testing upon the curriculum to discussions of how test
results can be used most effectively. Some observers, for example, fear
that the implementation of minimum competency testing programs will lead
to a narrowing of the curriculum, while others have speculated that an
increased focus upon test results will undermine credence in the
professional judgments of teachers.

Nonetheless, both critics and proponents of minimum competency
testing suggest that a significant challenge which administrators and
program planners face in implementing a minimum competency program is to
develop a course of instruction for students who will take such tests in
the future, as well as for students who have already failed the tests. As
Ryan (1979) and Shoemaker (1979) point out, a testing program will neither
improve nor guarantee learning. An additional problem that confronts those
planning a minimum competency testing program is to develop testing
activities that are an integral part of the instructional program. The
purpose of this chapter is to discuss how different programs have resolved
these issues and to summarize the suggestions and comments of program
planners.

A fundamental assumption of this chapter is that integrating testing
with instruction means ensuring that testing activities provide
appropriate information to the personnel responsible for decisions that
affect students, the curriculum, and instruction. Those responsible for
these decisions can include, for example, classroom teachers, the school or
district curriculum coordinator, school or district administrators, and
state-level personnel (e.g., Department of Education staff). Just as the
nature of their decisions varies, so will their information needs
differ.

The first step to help ensure that the testing activities provide
useful information is to identify who will use the test results and for
what purposes. This chapter is specifically concerned with those uses
relating to curriculum and instruction and also with those groups of
people who typically use test results in making decisions related to
curriculum and instruction. The first part of this chapter will present
examples of the ways in which teachers, local curriculum coordinators,
administrators, and state-level personnel use minimum competency test
results in altering and assessing curriculum and instruction.

In determining to what extent test results will be used in making
instructional and curricular decisions, administrators and planners might
want to consider ways to promote the use of test results by key consumers
(e.g., teachers, school personnel). Some programs have developed methods
for encouraging the use of test results. These will also be described.

Program planners responsible for developing instructional programs
both to introduce the competencies to students and to provide remediation
may wish to consider a variety of organizational arrangements. Programs
that have been implemented yield examples of possible arrangements. Some
options may be more appropriate to providing an introduction to the
competencies rather than remediation, and vice versa. Factors to consider
in choosing one arrangement over another include the number of students,
the size of the instructional staff, the availability of curriculum
materials related to the competencies, the physical facilities, the ability
and interest on the part of the staff in providing remediation, and the
possibility of using paraprofessionals and volunteers. The second part of
this chapter will discuss possible arrangements and how these factors
influence the choice of options.

In the third part of this chapter the general issue of how to
integrate testing and instruction is explored from a more comprehensive
perspective. This discussion will treat the development of program
components and the consequences for the instructional program. For example,
the choice of the testing schedule can have potential consequences for the
instructional program, such as ensuring that the staff which is to provide
remediation have the necessary training to do so. This discussion is
intended to point out that in dealing with the issue of how to integrate
testing with instruction, the methods chosen need not be limited merely to
using test results in more and better ways. Rather, a more comprehensive
view may be taken in which the testing and the instructional programs are
designed to complement each other.

The discussion below is therefore intended to provide a general
introduction to the following topics: the key audiences who might use
test results in making decisions affecting curriculum and instruction,
ways to encourage the use of test results by members of these groups,
options for organizing regular and remedial instruction in the
competencies, and suggestions for integrating testing with instruction
based on the design of specific program components. Where possible,
ongoing programs will be used to illustrate the options available to
program planners. The discussion does not assume that program planners have
identified how test results will be used, or that they have decided to
alter the curricula. Rather, the ways in which test results are typically
used are described, and general suggestions for organizing instruction in
the competencies and for integrating various components of the testing
program with the instructional program are included.



MCT Results and Decisions Related to Curriculum and Instruction



As noted above, key audiences who might use test results in making
decisions concerning curriculum and instruction include teachers, school
or local district curriculum coordinators, local administrators, and
state-level personnel (e.g., legislators, Department of Education staff
members). Test results may be used for diagnostic purposes in working with
individual students, for assessing the strengths and weaknesses of a
particular course or program, or for assessing the strengths and weaknesses
of a school district's instructional and curricular program. In most cases,
whatever the purpose, test results are used in conjunction with other
information. A teacher might review a student's records in reading as well
as the results of a minimum competency test in order to determine whether
the student required remediation. In few instances do administrators and
planners consider that minimum competency testing yields all the
information needed to make specific decisions.



Using Test Results for Diagnostic Purposes 



Hillsborough County, Florida, offers one example of a program in which
both test results from the statewide assessments administered in grades 8
and 12 as well as the results of locally developed minimum competency tests
are used to identify students in need of remediation. Administrators in
this program have developed a compensatory education program for grades
7-12. Once identified, students are assigned to special classes in which
diagnostic tests are first given to determine specific areas of weakness.
A posttest is also administered in these classes to measure students'
progress and these results are used to determine whether more remediation
is required.

The extent to which the results from minimum competency tests can
yield diagnostic information is a subject of debate among educators, since
these tests typically indicate only whether a student has or has not
mastered the competencies. Some have recommended that test results be used
primarily for screening to identify students who require remediation, and
that results be used in conjunction with other indicators, such as teacher
judgments. As Means points out, "If a student fails a test of minimal
competency in reading comprehension, the presumption underlying the model
is that diagnosis of the reading comprehension problems must be
successfully completed and that inferences must be made about the
diagnostic test data so that instruction can be prescribed. Yet, the task
of diagnosing problems related to reading comprehension is difficult
because at present test makers cannot factor discrete reading skills out
of the tests" (Means, 1979, p. 5).

Means goes on to suggest that even given the absence of diagnostic
information from current popular tests of reading comprehension, the
talented reading teacher may be able to successfully prescribe instruction
in reading comprehension on the basis of test results which merely
indicate general problems in this area. Hence, one issue administrators
and planners may face in determining how test results will be used and in
developing minimum competency tests is the extent to which testing both
can and will be used to yield diagnostic information. As in the case of
the Hillsborough County program, one option is to use results to identify
students requiring remediation, and then to obtain diagnostic information
useful in prescribing instruction through other means (e.g., other
testing, consultation with teachers trained in diagnostic techniques).
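
The two-stage use of results described above (screen with the competency
test, then diagnose by other means) can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the record
layout, the function names, and the cut score of 70 are hypothetical and
are not drawn from the Hillsborough County program or from any other
program described here.

```python
from dataclasses import dataclass, field

# Hypothetical cut score; the source does not state any program's standard.
PASSING_SCORE = 70


@dataclass
class Student:
    name: str
    mct_score: int  # score on the minimum competency test (used as a screen)
    # Scores by skill area from a separate diagnostic test, if one was given.
    diagnostic_scores: dict = field(default_factory=dict)


def needs_remediation(student, passing_score=PASSING_SCORE):
    """Stage 1: the competency test only screens; it says pass or fail."""
    return student.mct_score < passing_score


def weak_areas(student, passing_score=PASSING_SCORE):
    """Stage 2: a separate diagnostic test identifies specific weaknesses."""
    return sorted(
        area
        for area, score in student.diagnostic_scores.items()
        if score < passing_score
    )


def remediation_plan(students):
    """Map each screened-in student to the areas flagged by diagnosis."""
    return {s.name: weak_areas(s) for s in students if needs_remediation(s)}
```

The design keeps the two stages separate on purpose: the competency score
decides only who is referred, while the skill areas to be taught come from
the diagnostic data, which mirrors the caution that a minimum competency
test by itself may not yield diagnostic information.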


Ways to encourage appropriate uses of test results among teachers
include workshops and staff meetings in which the uses and limits of the
test data are discussed. The role of teacher judgments vis-a-vis the test
data is an issue administrators may wish to consider carefully. In some
programs (e.g., Fitchburg, Massachusetts) the teacher uses test results in
conjunction with a personal judgment to assess the progress of students in
basic skill areas. Too great a reliance upon test results may lead to the
neglect of other useful information about a student's learning
difficulties and their causes; minimizing the role of testing, however,
may result in the program being perceived as a pointless demand upon staff
time. Given estimates of the extent to which standardized test results are
used by teachers (see Goslin, Epstein, & Hallock, 1979), administrators
who perceive minimum competency testing as yielding useful information may
be interested in trying a variety of procedures for encouraging the staff
to use test data. Administrators may also want to consider instituting
such procedures as periodic surveys or interviews to uncover particular
obstacles (e.g., obscure report forms, lack of interest, or hostility)
that prevent maximum use of test results on the part of the staff.



Using Test Results to Evaluate Curricula

Minimum competency testing has been touted by some writers as a means
of assessing the strengths and weaknesses in the instructional and
curricular offerings of a school or district. In some programs that have
been implemented, the introduction of minimum competency testing has
stimulated review of the curriculum in areas in which specific
competencies are tested, while in other programs the results have been
used as an indicator of the areas in which changes in the curriculum or
teaching methods are necessary.


In South Burlington, Vermont, the state mandate that districts assess
specific competencies in the areas of mathematics, reading, writing,
listening, and speaking led to a review of the grade 1-12 curriculum in
those areas prior to implementation of the testing. Administrators and
teachers undertook this task in order to determine when mastery of each
competency could be expected and, hence, when assessments could begin; in
doing this they also identified when instruction in each competency
begins. Administrators report that staff members share a sense of
responsibility for teaching the competencies, since the curriculum review
and assessment results have indicated that each grade level, not just the
one in which testing of the competency begins, makes a contribution to
student mastery of the basic competencies.

In some programs staff members have prepared instructional materials
for teaching the competencies in regular classes. Detroit provides one
example of a program in which city administrators have prepared a manual
that makes suggestions about instruction. Thus, adoption of the competency
program there has resulted in additions to the curriculum.

Changes in the curriculum and instructional program have also been
initiated because of the results of competency tests. Administrators in
Peterborough, New Hampshire, report that staff members have, on their own
initiative, reviewed test results and altered teaching methods and course
materials when the results revealed major deficiencies in basic skill
areas. Administrators and program planners, therefore, may want to
consider whether a review of the instructional program and the curriculum,
undertaken as part of program development or as a consequence of the test
data, is an activity to encourage or initiate.



Using Test Results for State-Level Decision Making



A third way in which test results can be utilized is to assess
curriculum and instruction at the district level. In some states, test
results are designed for use primarily by state-level officials. In Rhode
Island, for example, the implementation of a testing program in basic and
life skill areas is designed to provide information to the State Board of
Regents on the quality of the educational system as a whole. The Board
will use this information in making decisions about the allocation of
resources for technical assistance. In Michigan, results of the statewide
assessments are used to identify school districts with large numbers of
students who are deficient in the basic skills and to allocate resources
to these districts so as to correct and prevent these deficiencies. In
other states, such as North Carolina, one use of test results is to help
in the estimation of the financial assistance districts will receive for
remediation. Both the number of students requiring remediation and the
severity of their deficiencies are taken into account by state officials
in allocating funds. Thus, another way in which test results are used,
particularly by state-level personnel in making decisions related to
curriculum and instruction, is as a global indicator of the extent to
which an educational system has achieved its goals.
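
To make the North Carolina example concrete, the sketch below shows one
way an allocation could weigh both the number of students requiring
remediation and the severity of their deficiencies. The weighting scheme,
the function name, and the severity scale are assumptions made purely for
illustration; the source does not describe the formula that state
officials actually use.

```python
def allocate_funds(total_funds, districts):
    """Split total_funds in proportion to each district's weighted need.

    districts maps a district name to a list of severity weights, one per
    student requiring remediation (hypothetical scale: 1 = mild, 3 = severe).
    A district's need is the sum of its weights, so both the number of
    students and the severity of their deficiencies raise its share.
    """
    need = {name: sum(weights) for name, weights in districts.items()}
    total_need = sum(need.values())
    if total_need == 0:
        # No district reported students requiring remediation.
        return {name: 0.0 for name in districts}
    return {name: total_funds * n / total_need for name, n in need.items()}
```

Under this assumed scheme, a district with a few severely deficient
students can receive as much as a district with many mildly deficient
ones, which is one plausible reading of taking both factors into account.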

In a recent presentation, the Superintendent of Instruction of North
Carolina suggested measures to be taken by state and local administrators
to encourage the use of test results. He proposed, for example, that
reports of test results contain a section devoted to discussing the policy
implications of the results. Such a practice can help to ensure that the
larger considerations are not lost in the implementation of the program.



Summary

This discussion is intended to illustrate how test results can be
used by various groups in making decisions related to instruction and
curriculum. Ways to encourage the use of test results in this connection
include informing audiences of how results may be used and the limits of
the information yielded by test data. As noted, observers of the rise of
minimum competency testing have suggested that administrators and program
planners consider issues such as: (1) the relationship of test results to
other indicators of the effectiveness of the instructional program, (2) the
extent to which the development of competency testing will include or spur
curriculum review, and (3) the use of test results in making state-level
decisions about providing technical assistance and/or funds.



Options for Organizing Instruction and Remediation



Program planners at the state and local levels have identified a
number of arrangements for introducing the competencies to students and
for providing remediation. Detroit school officials, for example, have
developed a program manual in which they list ways to organize
instruction. In addition, other writers have suggested general guidelines
for developing competency-based instructional programs, particularly those
designed for remediation. This section will discuss options for
organizing instruction in the minimum competencies drawn from the work of
Detroit administrators and other program planners, noting the guidelines
suggested by various writers. This section will also describe factors
that can influence the choice of instructional program.



Creating Special Classes



One way to begin teaching competencies to students or to provide
remediation is to create separate classes and instructional materials.
Students could attend these classes in order to learn specific
competencies, while remaining in the regular program in other areas.
Because all students are generally subject to the same competency
requirements, program planners who have chosen this option have, in most
cases, created special classes for remedial purposes, finding it more
feasible to integrate the initial teaching of competencies with the
regular instructional programs. Assignment to the special remedial classes
is, as in the case of Hillsborough County, an automatic consequence of
failing the state or local minimum competency test.



Hillsborough's testing program does provide a cautionary example for
administrators and planners who opt to provide remedial instruction
through the creation of special classes. In a class action suit brought
against the Hillsborough County School Board, the Superintendent, and
various state officials and groups, the plaintiffs claimed that the
creation of the compensatory education classes had resulted in a
resegregation of the public schools. In a ruling handed down in July 1979,
the judge determined that although the classes were populated by a
majority of black students, the program allowed the students easy access
back into the regular instructional program if they demonstrated mastery
of the requisite competencies. Moreover, the purpose of the program was to
remedy the educational deficiencies which were a result of previous
segregation. An issue, then, that administrators and planners may want to
consider, if special remedial classes are created, is how to ensure that
students can move easily between remedial and regular instruction.



Establishing Resource Centers

One alternative to special classes is to create centers where students
can go for assistance in mastering specific competencies. In Omaha,
Nebraska, for example, a student who has missed a specified number of
competencies may go to a mathematics laboratory for assistance. Activities
in the lab include working with instructional materials geared to those
competencies or seeking help from the resource person, who is usually a
mathematics teacher on the staff. Administrators responsible for staffing
such centers or labs might wish to consider the possibility of employing
paraprofessionals or parent volunteers.

Administrators in Detroit suggest utilizing the competency lab to
provide more formal instruction to students. For example, lab instructors
could teach mini-courses covering one or more competencies for students
who were unfamiliar with them or had failed to demonstrate proficiency.
Under this arrangement students could remain in their regular classes with
the exception of brief periods during which they would attend the lab for
instruction.






Tutoring 



Tutoring is another way to provide regular competency instruction
and/or remediation. Students who have mastered the competencies can tutor
students who are just learning the material. It may, of course, be more
feasible to introduce all students to the competencies at the same time;
in this case, small tutoring groups may not be the most effective strategy
to select.

If tutoring is selected as a way of providing remediation, it can
occur both inside and outside of regular classes. If the number of
students requiring remediation is small, then tutoring might be more
practical outside the regular classroom. For example, a nonprofit,
nonpartisan organization in New York City, the Public Education
Association (PEA), organized a volunteer tutoring program to help New York
City high school seniors pass competency tests by June 1979. The
competency requirement was the result of a 1976 resolution by the New York
Board of Regents, and by February 1979 approximately 15% of the seniors in
New York City had not passed the tests in reading and mathematics. The
Public Education Association, in conjunction with other interested
organizations, recruited and trained adults as tutors. Students were
tutored on a one-to-one basis in the high schools during regular school
hours when possible. Tutors also utilized other facilities (e.g.,
community centers and libraries) if needed, and on the average met with
students twice a week for one hour. PEA used a variety of media (such as
radio, television, leaflets, and newspaper articles) both to recruit
tutors and to inform students.

Community centers are the sites used for remedial tutoring in
Charlotte-Mecklenburg, North Carolina. In this program, tutorial centers
are open after school for interested students. The centers send contact
persons to inform students who have failed either the state or local
competency tests of the tutoring available at the centers.

With respect to community centers, MCT program planners may want to
consider supplementing regular school instruction in the required
competencies with tutoring provided by paraprofessionals or volunteers at
such centers. Competencies that require practice work or close monitoring
in order to achieve mastery might be introduced in the school, but
practiced outside of school. Teachers in Peterborough, New Hampshire,
developed a booklet on the essential competencies for parents of
elementary students. This booklet was designed to explain the particular
competencies; it also suggests activities a parent can do at home with the
child to facilitate mastery. These activities are intended to supplement
the introduction to the competencies a child receives in school.






Individualized Instruction



Another way of providing instruction or remediation that, like
tutoring, can occur within the regular program is to have students work
independently with self-paced learning materials. These materials can be
locally developed for specific instructional or remedial purposes, or
state- and district-developed exercises that have been prepared for
teaching the competencies may be adapted for remedial use. Both Detroit
city administrators and Vermont Department of Education staff members have
developed detailed suggestions for teaching the competencies. Local school
officials may find their instructional materials useful.



Choosing the Appropriate Arrangements 



Factors that will influence the choice of remedial and instructional
options include the number of students expected to participate, the size
of the instructional staff, the availability of curriculum materials, the
physical facilities, the training and interests of staff members, and the
availability of paraprofessionals and/or volunteers. As mentioned above,
introducing the competencies may be more efficient and cost-effective if
done in the context of the regular program of instruction. In cases where
mastery of a competency requires close supervision of a student's work or
the time spent in class does not permit all the necessary drill, program
planners may want to consider supplementing such instruction by using
paraprofessionals or volunteers inside or outside of the school. Thus,
tutoring would be one way of providing additional instruction, as would
providing the students with curriculum materials geared to the competency
for independent review.

Real differences emerge when these options, considered as remedial
strategies, are compared on the basis of the factors listed above.



Creating Special Classes

Given a fairly large number of students who are approximately similar
in ability, compensatory education classes may be the most efficient way
to provide remediation. Establishing special classes does entail ensuring
that the staff has adequate preparation to provide remedial instruction.
This option also entails having sufficient room to accommodate the newly
created classes. Demands on the staff could be reduced if
paraprofessionals are included as part of the instructional staff.



Establishing Resource Centers 



This option makes similar demands upon staff time and the physical
facilities. The availability of curriculum materials might help to offset
demand for staff time, especially if resource instructors served primarily
to refer students to materials rather than to provide actual instruction.
Using paraprofessionals or volunteers to staff centers would also reduce
demands upon the local staff.



Tutoring 


As a way of providing remediation, tutoring may be most effective 
given relatively small numbers of students needing close supervision. If 
persons other than teachers or other staff members serve as tutors (e.g., 
volunteers, parents, peers who have mastered the competency), this arrange- 
ment requires a smaller amount of staff time to maintain. The availability 
of curriculum materials could enhance the effectiveness of the tutors, 
particularly if they received training in specific remedial techniques. 



Individualized Instruction

This arrangement potentially places the least demand upon staff time
and facilities. The quality and comprehensiveness of available materials
will, of course, affect the extent to which students will need the
assistance and attention of the teaching staff. In addition, staff members
may require training in using such materials to teach the competencies.
Paraprofessionals or volunteers might also assist students in using
self-paced materials.



Summary

The selection of an option is always, of course, the result of
trading off factors such as the ones described above. Furthermore, no
matter what arrangement is chosen, some writers suggest two more general
guidelines: (1) that deficiencies be remediated at the earliest possible
instance in the curriculum, and (2) that, if remediation is provided at
the secondary level, students be given opportunities to participate in the
regular instructional program. The Massachusetts Right-to-Read Committee,
for example, asserts that "students must be taught the skills and kinds of
knowledge which the tests call for, and remedial instruction must begin as
soon as students show they have fallen behind in their progress toward
mastery of basic skills" (Slingerland, 1978, p. 12). Speaking to the
issue of remediation at the secondary level, Ryan (1979) suggests that
remediation be "supportive, not demeaning; that . . . [it] be appropriate
to the age level of the student and conducive to the development of
self-esteem" (p. 17).



Integrating the Testing Program with 
Curriculum and Instruction 



Apart from determining how test results will be used and by whom, how
to encourage their use, and how to structure basic instruction and
remediation, program planners and administrators can further ensure the
integration of testing with instruction by considering the development of
each program component in light of its implications for the instructional
program. This section will briefly discuss three specific components (the
minimum competencies, the test instruments, and the testing schedule) and
how their design can affect instruction and curriculum. The purpose of
this discussion is to underscore how a concern with integrating testing
with instruction can underlie the entire process of program development.




The Competencies and Instruction

Although the procedures for defining competencies are discussed in
another chapter, this chapter will consider this component from the
standpoint of the relation between instruction and testing. In some
competency testing programs, administrators and program planners have
written competencies in order to make them easier for teachers to
understand and to teach. In Detroit, for example, city administrators have
prepared a program manual in which each competency is carefully defined
and ways of teaching the competency to students are described. Similarly,
in Vermont, State Department of Education staff have developed a handbook
for teachers that describes ways of teaching competencies in reading,
writing, mathematics, listening, speaking, and reasoning. Competencies
written with a view to their comprehension and teachability will ensure
that the program components will mesh with the instructional program.



The Test Instruments and Instruction 

Testing activities can also be made an integral part of instruction.
For example, in some programs evidence of proficiency includes course work
or extracurricular involvement. In Omaha, Nebraska, students demonstrate
proficiency in problem solving by defining a social problem in a required
history course and then proceeding to follow a six-step process to solve
it. Steps include proposing and researching a solution. Students first
solve such a problem as a class homework assignment, and then choose a
different problem for solution in order to demonstrate competency.

In St. Paul, Minnesota, students attending the St. Paul Open School
can assemble a portfolio to demonstrate competency in areas such as career
education, community involvement, and consumer awareness. Such a
portfolio may include letters of testimony from employers and personal
accounts of work experiences.

The National Education Association, interested in encouraging the use
of indicators other than standardized test results, has cited a variety of
options for educators to consider. In their handbook Alternatives to
Standardized Testing, Quinto and McKenna (1977) suggest contracts,
conferences, and teacher-made tests as ways of assessing proficiency.
While the authors address the more general issue of how to assess student
progress, their discussion is relevant to the issue of how the minimum
competencies may be assessed. Their suggestions for alternative means of
assessment may provide a way to better integrate testing and instruction.






The Testing Schedule and Instruction



To give yet another example of how a program component can be viewed
in terms of the relationship between testing and instruction, consider the
issue of how to determine the testing schedule. In Vermont, for example,
the state specifies the competencies to be assessed but does not specify
the testing schedule. Rather, the state stipulates that beginning with
the class of 1981, students must master competencies in particular areas
in order to graduate. To determine when to begin assessing students on
the basic competencies, administrators in South Burlington, Vermont, along
with a group of teachers, conducted a curriculum review. The purpose of
the review was to find out when instruction in each competency began, and
to estimate when a student could be expected to have mastered each
competency. The point at which mastery is expected is the point at which
the student is first assessed on the competency. In this program, testing
activities were keyed to the instruction.

In addition to considering the option of relating the testing schedule
to instruction, administrators may also promote the integration of testing
and instruction through carefully weighing the potential impact of the
testing schedule upon curriculum and instruction. One such consequence,
if the number of students requiring remediation is high, might be a need
for teachers with special training in teaching the competencies at an
appropriate level. For example, since introducing minimum competency
testing into the high schools, school administrators in Gary, Indiana,
have hired teachers who are trained in providing remediation in the basic
skills to high school students. These administrators discovered that many
secondary teachers either were not trained to teach basic skills in high
school or were not interested in teaching remedial classes. Instituting
remedial classes at the high school level meant hiring teachers
specifically to teach remedial courses in basic skills such as reading.
Thus, the selection of a testing schedule may result in special demands
being made upon the talents and interests of the staff.



Summary 



The above examples are intended to illustrate the point that a
concern for strengthening the relationship between testing and instruction
need not be limited to considering how to promote the effective use of
test results and possible remedial strategies. This concern is an
appropriate one for all stages of program development.






References 






Detroit Public Schools. Detroit High School Proficiency Program: Program
manual. Detroit, Michigan: Author, 1979.

Goslin, D. A., Epstein, R., & Hallock, B. A. The use of standardized tests
in elementary schools (Second Technical Report). New York: Russell
Sage Foundation, 1965.

Means, H. J. Reading and minimal competency testing. Paper presented at
the annual meeting of the American Educational Research Association,
San Francisco, April 1979.

Phillips, A. C. State and federal roles in testing: As viewed by the
state superintendent of North Carolina. In R. M. Bossone (Ed.),
Proceedings of the Second National Conference on Testing. New
York: Center for Advanced Study in Education.

Quinto, F., & McKenna, B. Alternatives to standardized testing.
Washington, D.C.: National Education Association, 1977.

Ryan, C. The testing maze. Chicago: National PTA, 1979.

Shoemaker, 0. S. Minimum competency testing: Implications for
instruction. (Unpublished paper) Washington, D.C.: National Institute
of Education, 19[illegible].

Slingerland, J. The minimum competency movement. Chairman's Report,
Massachusetts Advisory Council for the Right-to-Read Effort.
Boston, 1978.






CHAPTER 6 

PROGRAM MANAGEMENT
William Phillip Gorth and Peter E. Schriber*



Introduction 



This chapter will present a set of preliminary procedures for
preparing a management plan for a minimum competency testing program. In
the preparation of such a plan, the specification of personnel needs and
the determination of costs will play significant roles. Since budgetary
constraints affect every component of a program, this discussion will
touch upon the nature of the costs which an MCT program is likely to
entail. It should be stressed, however, that neither specific costs nor
estimates will be offered to the reader.

In addition, those responsible for the planning and management of an
MCT program will find it necessary to locate and identify personnel to
perform the many tasks which the program may require. Consequently,
guidelines and strategies for meeting personnel needs will also be
discussed here.

This chapter essentially provides a repertory of procedures and
strategies from which educators responsible for program management can
draw at will. This presentation will not exhaust all possible alternatives
and will not prescribe specific techniques or modes of organization; it is
a possibility that none of the procedures under discussion will be
apposite to a particular program. It is hoped, however, that even in such
a case, this discussion will be useful in that it may stimulate educators
to look at their management needs in a fresh light as a result of the
considerations introduced here.

The topics brought forward in this chapter have been selected because
they are the issues which seem to be of the greatest interest or concern
to those responsible for the design of management plans for competency
programs. In the course of the discussion, examples will be drawn from
various programs. This practice, however, is in no way an endorsement of
a particular procedure; these examples have been chosen only to illustrate
a point more clearly and to suggest the wide range of solutions possible
for each of the problems or topics under examination.

* With organizational assistance from Dolores R. Harris.


As a step preliminary to planning for an MCT program, it has been
found useful to establish a center of control or focus of responsibility
for the activities which are to be undertaken. In all 52 of the programs
of the study, whether initiated at the state level or the local district
level, whether initiated by legislative mandate, by the action of a state
board, or at the direction of a local agency, the control and
administration of the program had been delegated to a single agency or
individual that assumed all responsibility for planning, coordinating, and
managing all the activities which the program called for. In the field, a
variety of arrangements for this purpose were encountered. Throughout this
chapter on planning it will be assumed that the center of control and
responsibility for an MCT program has been established, and, for the sake
of convenience, it will be assumed that this center of control resides in
the person of a program director. However large or small the program, the
duties and functions of such a program director remain essentially the
same from program to program. Since this role is such an important one, it
may be worthwhile to consider the functions of the director and the
qualifications which might equip a candidate to occupy this position and
carry out its duties successfully.


In a minimum competency testing program the director occupies a
position which is intermediate between the initiating or policy-making
bodies and the constituencies that will be affected either directly or
indirectly by the program. Important qualifications for the director,
therefore, may be the ability to understand the diverse viewpoints and
concerns of these groups, conjoined with the ability to find common or
unifying themes in this diversity which will facilitate the task of
implementing the program. To increase the likelihood of accomplishing the
program goals, the director might best be drawn from a pool of candidates
familiar with a given educational system and with the community it serves.
Experience in educational planning and administration and demonstrated
ability to organize and direct groups are also extremely desirable
attributes in a prospective program director.

As additional sources of information on the duties and qualifications
of a program director, the reader may wish to consult the Competency
Handbook of the Ohio Department of Education and the California Technical
Assistance Guide for Proficiency Assessment, both of which appear in the
reference list at the end of this chapter.






In order to assess the planning and management needs of a minimum
competency testing program it may be helpful, as a first step, to prepare
a full account of the stated purposes and goals of the program, and a list
of the prescribed activities through which these goals are to be realized.
This procedure can clarify the nature and extent of the task, since it
will delineate the essential and irreducible structure of the program.
This essential structure or form will, of course, vary from program to
program. In some instances, as in certain statewide programs, for example,
the policy-making body has not only initiated the program, but has also
specified its components in detail. Such programs may present the planner
with a set of competencies, an established testing schedule, predetermined
target groups, approved testing instruments, prescribed standards, and
explicit directions for generating reports of the test results and for the
uses of these test results, both in making decisions about students and in
supplying information to the public.

At the other extreme, some state and local programs have been
formulated in the broadest terms possible, leaving decisions on these and
other issues to the discretion of the individual agency, and, in effect,
to the program director or planner. In either case, however, this first
procedure will establish all and only the essential elements of the MCT
program.

It may be useful at this point to categorize these elements as
belonging to one of three program components: (1) instruction, (2)
testing, and (3) remediation. By definition, all MCT programs will have a
testing component. The inclusion of one or the other, or both, of the
remaining two components appears to be an optional feature. It may be wise
to point out here that the adoption of this mode of categorization does
not mean that all of the components in a program are of equal importance.
For example, one program may key its testing to the curriculum, so that
the curriculum components will define the domain of the testing
components; this has been the course followed in the MCT program in South
Burlington, Vermont. In yet another program, the reverse may be true: the
testing component will establish the desired educational results,
requiring the adjustment or redesign of the curriculum component. The
program in Peterborough, New Hampshire exemplifies the second
configuration.

After the essential structure of the program has been outlined, its
various elements categorized under the appropriate components, and the
hierarchical order of the components determined, it may then be possible
to specify the tasks necessary to implement each component. To
characterize the nature of each task identified, it is often helpful to
ask a set of questions which will determine the procedures and resources
necessary to accomplish that task. For the purpose of discussion, it will
be assumed that the task is a unitary one, which cannot be broken down
into subtasks. Some appropriate questions might be grouped as follows:






SEQUENTIAL ORDER:

— What tasks, if any, must precede this one?
— What tasks must follow?

METHOD:

— What methods can be employed to accomplish this task?
— What methods are available for use in this program?
— What method is the most feasible for this program?

RESOURCES:

Personnel
— What personnel does this task require?
— What personnel are available?

Expertise
— What kind of expertise does this task require?
— What expertise is available?

Time
— How much time will the task require?
— How much time is available?

Funds
— What expenditures are necessary for the task?
— What funds are available?

These questions point up the fact that in planning and managing a program,
the issue of program needs versus the availability of resources needs to
be considered at every step.
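
As an illustration only, the questions above can be recorded for each
task in a simple structure that sets needs against available resources.
The sketch below is a minimal one and is not drawn from any program in
the study; the class name, field names, and all figures are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One unitary planning task, described by the questions above."""
    name: str
    preceded_by: list = field(default_factory=list)  # sequential order
    method: str = ""                                  # method judged most feasible
    required: dict = field(default_factory=dict)      # resource -> amount needed
    available: dict = field(default_factory=dict)     # resource -> amount on hand

    def shortfalls(self):
        """Return each resource whose need exceeds what is available."""
        gaps = {}
        for resource, need in self.required.items():
            have = self.available.get(resource, 0)
            if need > have:
                gaps[resource] = need - have
        return gaps


# Hypothetical figures, for illustration only.
task = Task(
    name="pilot test items",
    preceded_by=["write items", "edit items"],
    method="administer draft items to a sample of students",
    required={"staff_days": 20, "funds": 1500},
    available={"staff_days": 12, "funds": 2000},
)
print(task.shortfalls())  # {'staff_days': 8}
```

A planner might keep one such record per task and review the shortfalls
at each step, which is one way of keeping needs and resources in view
together.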

Since specific tasks, such as identifying the competencies, test
development, standard setting, and dissemination, are dealt with in other
chapters in this document devoted to these topics, the remainder of this
chapter will concern itself with a discussion of two subjects: personnel
resources and the ways in which a program director might develop and employ
these resources to their maximum effect in order to achieve the stated
goals of the program; and the costs which an MCT program may involve.






Personnel 






A program director may wish to call on both internal and external
sources to satisfy the personnel needs of a given MCT program. Internal
sources include the teaching staff, administrative and technical staff,
and clerical staff employed by an educational system. These staff members
will be the most likely source of personnel for tasks which, for their
accomplishment, require specific knowledge of content areas, methods of
testing and evaluation, and curriculum design. In some programs, the
local district may also be in a position to draw upon the expertise of
state-level specialists to assist them in these matters. Internal staff
often play an important role in test development in programs which engage
in this activity. Also, members of the teaching staff usually administer
tests, and frequently score them. Remediation, reporting results, and
dissemination are other activities in which internal staff may participate,
conditional upon the design of the MCT program.

External resources for personnel may include outside educational
contractors, consultants, or specialists called in to assist with one or
more components of the MCT program. Their use is often dependent in large
part not so much upon need as upon the availability of funds for this
purpose.

A very important source of external personnel is, of course, the
community which an educational system serves. It appears that the most
successful programs of the study, in many cases, were those which engaged
a broad representation of community members in the tasks of program
development. Active involvement seems to generate support and enthusiasm
for a program which can act as a powerful catalyst.

The formation of a committee is the most usual method by which
community members are drawn into active participation in an MCT program. A
review of the programs of the study will reveal the wide variety of
activities and tasks which such committees have undertaken in the design
and implementation of MCT programs at both the state and local district
levels. And it has been observed that "communities are more prone to
accept changes in their school systems if they are not only informed but
also involved in the process" (California, SDE, 1977, pp. 111-6).

There are at least three kinds of committees which can be employed in 
MCT programs: 



(1) ADVISORY COMMITTEE — represents wide range of interests and 
reviews general program policy. 




(2) STEERING COMMITTEE — deals with detailed aspects of program
policy, may prepare draft versions of policy statements, and
may have membership which is a subset of the Advisory Committee.

(3) WORKING COMMITTEE — one or more Working Committees may be
established to accomplish specific tasks necessary to implement
the MCT program and may have membership which overlaps partially
or not at all with the Advisory Committee (Ohio, SDE, 1978,
pp. 2-4).



A review of the state materials prepared to assist planners with MCT
program development shows widespread agreement on the considerations which
are especially relevant to the formation of such committees.



Committee Composition



It is recommended that the composition of the committee be carefully
planned, appropriate to the tasks it will be assigned, and representative
of the community affected by its work, whether that community is defined
by the geographical boundaries of the school district or of the state.
The committee, if it represents a cross section of the community, can make
it possible to gather information about all constituencies as to what they
want, approve of, understand, and will support. The extent to which these
different constituencies are involved in the MCT program may determine the
extent to which the results of the program will be supported by the
community. It is important to realize that special interest groups within
the school may be as important as those in the community; therefore, the
members of such groups will also be desirable as committee members.



Committee Selection 



It is recommended that a selection strategy be explicitly determined
by the local or state board or superintendent and implemented by the
program director. One strategy is to appoint individuals who have been
active in school affairs. However, if all the members are selected in
this way, the committee may not accurately reflect the community. A
second strategy is to set guidelines for selection in order to achieve a
balanced membership. A third strategy is to solicit the participation of
community members by open invitation, which allows for greater community
involvement. It is possible also that a combination of these strategies
can help to obtain members from all the interest groups crucial to program
success. At the beginning of the selection process it may be wise to
emphasize to prospective members that they will be expected to serve
actively.



Committee Functions



The functions of the committees versus those of the program director
may need to be clearly differentiated. Although each committee may be
generally considered as advisory in nature, a committee can assume a
decision-making role as a primary voice to the community or as the
technical experts in a particular subject. Therefore, the committee may be
useful as a forum for sounding out ideas or for defining and selecting
alternative approaches at every stage of program development.



Committee Size 



The nature of the task which a committee is to perform will very often
determine its size. If the committee is an Advisory Committee designed to
represent community interests adequately, it may very well contain 25-50
members (e.g., Massachusetts Statewide Advisory Committee). A Steering
Committee, on the other hand, may require only 5-10 members to handle
material development effectively (cf. the Detroit Public Schools program).
Working Committees usually require 5-12 members to represent the various
professional opinions adequately.
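
For planners who keep committee rosters in machine-readable form, the
size ranges mentioned above can be turned into a simple check. The sketch
below is illustrative only: the ranges are those quoted in the preceding
paragraph, which are typical figures rather than rules, and the function
and variable names are hypothetical.

```python
# Size ranges quoted in the paragraph above; they are typical figures,
# not requirements. All names here are assumptions made for illustration.
SUGGESTED_SIZE = {
    "advisory": (25, 50),
    "steering": (5, 10),
    "working": (5, 12),
}


def size_check(kind, members):
    """Compare a committee's membership count with the suggested range."""
    low, high = SUGGESTED_SIZE[kind]
    if members < low:
        return "below suggested range"
    if members > high:
        return "above suggested range"
    return "within suggested range"


print(size_check("advisory", 30))  # within suggested range
print(size_check("steering", 12))  # above suggested range
```

A result outside the range is only a prompt for review, since the task
assigned to a committee remains the main determinant of its size.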



********** 


For further information on this subject, the reader may wish to
consult the materials prepared by the California, Illinois, and Ohio
Departments of Education. These handbooks present useful information,
organizational charts, and strategies designed to assist program planners
in meeting the personnel needs of their programs.






It is also possible to achieve community participation by other
means. In some programs public meetings have been convened for the
purpose of permitting members of the community to express their views about
the minimum competency testing program. If possible, several such
meetings could be held in different locations in order to reach as many
people as possible. Meetings may also be scheduled for such groups as
business and professional organizations, trade unions, associations of
parents and advocates of students with special needs, and ethnic and
cultural organizations. It is advisable to prepare carefully for such
presentations, since they will usually serve a dual function: not only do
they permit educators to collect information about the concerns and needs
of community members, but such meetings also provide the educators with an
opportunity to inform the public about the goals and purposes of the MCT
program.

The survey is another useful method for reaching the public. It may
be a comprehensive survey, such as that employed in the Detroit program,
in which completion forms printed in the local newspapers solicited the
opinions of all those who wished to respond. On the other hand, a survey
can be employed to focus on a particular segment of the community. In the
Maine program, the Benchmark Survey was confined to a representative
sample of high school teachers, and sought their views on the performance
levels which could be reasonably expected of Maine eleventh-graders.

The public meetings and surveys described above were connected with
various aspects of program design and development. Another task to which
members of the community might contribute is that of remediation. In New
York City, volunteers were recruited and trained as tutors for deficient
students. Such a measure has the added advantage of supplying students
with individual remedial instruction at a relatively low cost.
Remediation offers opportunities for involving certain other
constituencies with an interest in the program. Parents, of course, have
an obvious interest in a child's success, and many programs require
parental participation in the design of remedial programs for a student
who has failed the minimum competency test. Parents who are involved in
this fashion may well be more receptive to suggestions as to how they may
help their children to achieve mastery in the required competencies.

The Detroit Program Manual suggests peer tutoring as one way of meet- 
ing the remediation needs of an MCT program. Students with demonstrated 
competency may be able to assist their contemporaries to acquire the skills 
needed for mastery, and deficient students may respond more positively to 
instruction from a fellow student. Program directors will know best which 
strategies are appropriate for use in their own programs. 






Costs 



The costs of a program will be dependent upon the components of the
plan. For example, a program may be designed as part of a larger
competency-based educational program, in which new curricula are
developed, or it may entail the use of a single, commercially available
test. Obviously, the costs for each program will differ greatly.
Therefore, it may be most useful simply to characterize the various kinds
of costs common to most programs.

Airasian, Madaus, Pedulla, and Newton (1979) discuss costs associated
with MCT programs under four different categories: program development,
test administration, consequences of the program, and intangibles relating
to acceptance of the program. The following discussion has adopted the
first three of these categories in order to present the material
systematically.



Program Development 

This category covers start-up costs. They occur only once in a
program; however, if the program is constantly refined, these costs may
have their counterparts in the maintenance costs of the program.

Planning. These are largely personnel costs and at the local school
district level may be absorbed in regular salary time by the reallocation
of staff efforts. However, the more complex the program, the higher the
cost for staff, because a complex program will require more staff time for
planning.

Identification of competencies and development of competency
statements. In most programs this process will involve input from educators
and community members. Time of the program staff is necessary to
coordinate the activities of the many people serving on the advisory
committees and working committees which are usually involved in the
development of competencies. Because the competencies are the basis for
later development, their identification may require a relatively large
amount of staff time.






Development of curricula or matching competencies to existing
curricula. This includes the alignment of instruction with the MCT program.
If curricular development or modification is planned, significant costs
may be necessary to fund staff time for the development of materials,
duplication, and secretarial support.



Program dissemination. Supplying information to the community and
the staff of the school may be one of the requirements of the program.
Staff time will be necessary to write the notices and reports. The cost of
printing and distribution is directly proportional to the number of
persons contacted. Nonprint media may be much more expensive to develop
but have a lower distribution cost, if radio and television stations will
contribute the time.
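
The trade-off just described can be expressed as simple arithmetic.
The sketch below is illustrative only and assumes a fixed writing cost
plus a constant per-copy cost for print, and a single development cost
for nonprint media with little or no distribution cost when air time is
contributed. The function names and all dollar figures are hypothetical.

```python
def print_cost(writing, per_copy, persons):
    """Writing cost plus printing and distribution proportional to audience."""
    return writing + per_copy * persons


def nonprint_cost(development, distribution=0.0):
    """Development cost plus distribution (zero if air time is contributed)."""
    return development + distribution


def break_even_audience(writing, per_copy, development, distribution=0.0):
    """Audience size at which print and nonprint totals are equal."""
    return (development + distribution - writing) / per_copy


# Hypothetical figures: $200 to write a notice, $0.25 per printed copy,
# $2,000 to develop a broadcast announcement with contributed air time.
print(break_even_audience(200, 0.25, 2000))  # 7200.0
```

Under these assumed figures, print is the cheaper medium for audiences
smaller than the break-even size and the broadcast announcement is
cheaper for larger ones.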



In-service education. Staff members may need training in developing
or selecting competencies, in interpreting and using test results, and in
planning instruction to align their teaching with the competencies. Costs
may be separated into costs for the preparation of materials (professional
staff, secretarial support, and printing) and costs for the presentation
(presenter and participant time).

Test selection/development. More staff time is necessary to develop
than to select a test. A commercial test, however, involves the cost of
buying copies of the test for each administration. Either development or
selection will require staff time to consider the content appropriate to
the test, and to review the test with committees. If the test is
developed, added staff time will be necessary for writing items, editing
items, pilot testing items, revising items, and producing the final copy
of the test for duplication. Supplies, the duplication of materials for
review and pilot testing, support for the analysis of pilot testing, and
secretarial support for the preparation of drafts and final copy will also
be necessary.



Testing 



After the test has been selected or developed, a number of costs will
be repeated at each administration. These costs are stable from year to
year, except for increases due to inflation, and, therefore, predictable.
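
A rough multi-year projection follows from the distinction drawn above
between one-time start-up costs and testing costs repeated at each
administration. The sketch below is illustrative only: it assumes one
administration per year and a constant inflation rate, and the function
name and all figures are hypothetical.

```python
def projected_costs(start_up, annual_testing, inflation, years):
    """Yearly totals: start-up in year 1 only, testing costs grow with inflation."""
    totals = []
    for year in range(years):
        recurring = annual_testing * (1 + inflation) ** year
        one_time = start_up if year == 0 else 0
        totals.append(round(one_time + recurring, 2))
    return totals


# Hypothetical figures: $10,000 start-up, $4,000 per administration,
# 10 percent annual inflation, three years.
print(projected_costs(start_up=10000, annual_testing=4000,
                      inflation=0.10, years=3))
# [14000.0, 4400.0, 4840.0]
```

A program that is constantly refined would add a maintenance term to the
later years, as noted under Program Development.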




Test administration and scoring. Space and time allocation, test
administrators, test printing or purchasing, test security, test
distribution and collection, and test scoring all require an estimation of
costs.



Reporting of test results. Preparing and writing the reports of test
results, whether computer-based or narrative, for the student, the parent,
the media, and for instructional staff will result in expenses for staff
time, secretarial time, printing, and distribution. Computer programming,
computer time, and consultant time will add to the expense if the agency
feels it needs these resources.



Provisions for special students. Students with special needs and
limited English-speaking students may entail additional costs, if the
program decides to offer alternative assessment strategies for these
students.



Consequences of the Program 



Instructional implications of the testing results. The available
resources of money, teacher time, and instructional materials will
determine the number of students served and the nature of a remediation or
alternative instructional program.



Litigation. Since an MCT program focuses on student performance, some
lawsuits have been filed with respect to the legal grounds of such a
program and its policies. Contingency planning for the costs of staff time
and legal services in this connection may be necessary.



Dissemination. Test results are important to the public in their
function both as parents and as taxpayers. Expenditures which may be
involved in the dissemination efforts of a program are discussed in detail
in the next chapter.



********** 



In addition to the discussion in this chapter, a monograph published
by the U.S. Office of Education, titled The Resource Approach to the
Analysis of Educational Project Cost, presents a model which is based on
the resources necessary to operate a program and which may be used to
compare different configurations of a project in different locations. It
may be useful to make a preliminary estimate of costs based on the
information provided by existing programs at district and state levels.
Other articles which provide general information about costs are Anderson
(1977) and Miller (1978).







References 



Airasian, P., Madaus, G., Pedulla, J., & Newton, K. B. Costs in minimal
competency testing programs. In P. Airasian, G. Madaus, & J. Pedulla
(Eds.), Minimal competency testing. Englewood Cliffs, New Jersey:
Educational Technology Publications, 1979.

Anderson, B. The costs of legislated minimal competency requirements.
St. Louis, Missouri: CEMREL, 1977.

Brickell, H. M. Seven key notes on minimal competency testing. In B. S.
Miller (Ed.), Minimum competency testing: A report of four regional
conferences. St. Louis, Missouri: CEMREL, 1978.

California, State Department of Education. Proficiency assessment in
California: A status report. Sacramento, California: Author, 1979.

Communication Technology Corporation. Project director's manual.
Moreton, New Jersey: Author, 1976.

Dalton, S. Commercial versus school-district-made tests to measure
minimal proficiencies for high school graduation. Riverside,
California: Riverside Unified School District, 1978.

Gourley, R. N. Learning to manage a minimum competency testing program.
Paper presented at the meeting of the American Association of
Colleges for Teacher Education, March 1979.

Illinois, State Office of Education. Establishing educational priorities
through the Illinois Problems Index: User's manual. Springfield,
Illinois: Author, 1977.

Miller, B. S. (Ed.) Minimum competency testing: A report of four regional
conferences. St. Louis, Missouri: CEMREL, 1978.

New York, State Education Department. Regents examination manual.
Albany, New York: Author, 1976.

Ohio, State Department of Education. Competency handbook. Columbus,
Ohio: Author, 1978.

The resource approach to analysis of educational project cost (Evaluation
in Education Monograph No. 3). Washington, D.C.: U.S. Government
Printing Office.

Shepard, L. A. A method for evaluating assessment. Paper presented
at the sixth annual Conference on Large-scale Assessment, Boulder,
Colorado, June 1976.




CHAPTER 7

DISSEMINATION



Peter E. Schriber and William Phillip Gorth



Introduction 



Purpose

This chapter discusses issues and techniques relevant to preparing a
dissemination plan for a minimum competency testing program. The issues
and techniques are those identified in the survey of 31 state and 20 local
MCT programs; the discussion is based upon interviews with program planners
and administrators as well as an analysis of program materials. In
addition, the writings of other professionals in education were used to
highlight key points.

The considerations and suggestions presented below are neither
exhaustive with respect to the general topic of dissemination nor are they
prescriptive in nature. The discussion is directed towards program planners
and presents examples of considerations and practices they may wish to
consider in developing a dissemination strategy for an MCT program.
Existing programs and program materials are cited to illustrate specific
points.

The chapter is organized in the following way. The basic elements
involved in the planning process for dissemination are presented and
discussed first, with examples of ways in which such a plan may be
documented concluding the chapter. The significant outcomes of the planning
process are the identification and selection of appropriate media by which
dissemination of information about an MCT program is to take place. While
the discussion of these outcomes appears late in this chapter, it
essentially serves as the justification for the earlier discussion.






The Planning Process 



Program planners suggested that dissemination activities be carefully
planned and executed in order to maximize effectiveness. Since a major
purpose for dissemination is to promote awareness and gain acceptance and
support for an MCT program, a poor dissemination effort may result in
strained community relations, misunderstanding by special interest groups
both in the community and on the school staff, and loss of community trust
in the schools. The following subsections discuss issues raised by
program planners, as well as those found in program materials.



Identifying the Purposes for Dissemination

It is assumed that the MCT program is necessary, endorsed by the
schools in the district (or, if statewide, by the districts in the state),
and so designed as to achieve its objectives. In general, it has been
found that, if the program initiators and implementers are not behind an
MCT program, the dissemination effort is likely to be of little use. For
instance, the Charlotte-Mecklenburg, North Carolina, district carefully
planned approaches for community awareness and involvement through public
media and other community outreach efforts from the very inception of its
MCT program. This dissemination effort was and continues to be an
important activity of the managerial staff of the program. In general,
dissemination activities are an integral part of an MCT program and, as
such, require as much thorough consideration and planning as other program
components.

Discussions of the general purposes and principles of dissemination
may be found in various materials. The California Department of Education
in its Technical Assistance Guide stresses the importance of using
dissemination to promote community involvement, while the National School
Public Relations Association publishes a booklet on this topic. Goals of
dissemination identified in this publication, as well as in the California
material, include to inform, to gain acceptance or compliance, to obtain
support, cooperation or participation, or to encourage the use of results.

Since the first three purposes typically require increasing degrees
of involvement on the part of the target audience, each succeeding one may
require more effort to accomplish.




Identifying Types of Information to be Communicated

California administrators suggest that a critical component of a
dissemination plan is usually a specification of the types of information
to be disseminated, and as a result, that program developers consider the
kinds of information various audiences may be interested in.
Consequently, it may be advisable to start by compiling a complete list and
by organizing this information list in such a way as to achieve an overall
view of the dissemination plan. This will help to identify and remedy any
major gaps that may be apparent in the plan.

It may also be helpful to prepare a detailed description of the MCT
program, complete with the rationale for each component and procedure, for
eventual communication to different school and community audiences (Hubbell
& Stech).

Identifying aspects of the program. Among the issues typically
considered by program planners in determining what to disseminate are the
amount and the nature of the information to be provided. Current MCT
programs across the country generally recognize that the purpose of a
dissemination effort is to present a coherent view of a well-designed and
well-conceived program with clearly expressed goals which do not
discriminate against any group. The major aspects of MCT programs are
identified below in the form of a checklist which may be useful to include
in a dissemination plan. The specific details of the plan for a local or
state-level program may be completed as the planner sees fit.



(1) Program name
(2) Policy history
(3) Program goals/purposes
(4) Competencies
(5) Standards of performance
(6) Target groups and testing schedules
(7) Test instruments
(8) Test administration
(9) Use of test results



Determining the types of test results. Test results can be generated
and reported in many formats and in various forms of descriptive
statistics. Understanding these results and the different modes of
presentation for these results is often a problem for the disseminators as
well as the intended audiences. A thorough discussion of the types of
results that can be prepared for particular tests and audiences is beyond
the scope of this document. However, being aware of the importance of test
results and of their impact on various audiences will facilitate the
planning for dissemination. Test results are the data most easily and most
often misunderstood in any program which involves testing (Hubbell &
Stech). One important reason for this is that numbers, scores, and
statistics may be reported either without sufficient explanation or
without sufficient knowledge of the level of understanding that each
audience will need in order to assimilate the information in the manner
prescribed in the dissemination plan.

A discussion of alternative methods of presenting test results to
various audiences and for the use of test results in instructional
diagnosis and planning is presented in another chapter of this document.



Identifying Key Target Audiences

The MCT programs represented in the study disseminate a wide variety
of information to a wide variety of target audiences. In general, it has
been found that a well-planned strategy will identify these audiences and
select the information and the dissemination method appropriate to each.
To assist in this task, some MCT programs, such as California and Florida,
have asked the following questions:

— What are the audiences and who are their members?

— What are the critical concerns of each audience?

— What is the perspective of each in understanding or dealing with
the MCT program?

— What information must be presented to each audience and for what
purposes?

— How will critical concerns be faced?
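
One way to keep track of the answers is to hold a small record for each
audience, with one field per question. The sketch below is illustrative
only and is not taken from the California or Florida materials; the class
name, field names, and sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AudiencePlan:
    """Answers to the planning questions above for a single audience."""
    audience: str
    members: str = ""       # who belongs to this audience
    concerns: str = ""      # its critical concerns
    perspective: str = ""   # how it views or deals with the MCT program
    information: str = ""   # what must be presented, and for what purpose
    response: str = ""      # how its critical concerns will be faced

    def unanswered(self):
        """Names of the questions still left blank for this audience."""
        return [name for name, value in vars(self).items()
                if name != "audience" and not value]


# Hypothetical entry, for illustration only.
plan = AudiencePlan(
    audience="parents",
    members="parents of students in tested grades",
    concerns="consequences of failing the test",
)
print(plan.unanswered())  # ['perspective', 'information', 'response']
```

Listing the blank fields for every audience gives a quick view of where
the dissemination plan still has gaps.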

In planning a strategy to answer these questions, there are specific
issues and guidelines which help to add focus. The discussion which
follows is based on discussions with program personnel and on treatments
of the issues by Hubbell and Stech in a publication of the Colorado
Department of Education. Critical issues and decisions will be highlighted
and potential problems in the dissemination effort will be identified.



As an introduction to the discussion, it may be useful to consider
the various audiences for information from which support may be needed.
There are many audiences, and it may be important to identify as many
groups and key individuals as possible. For instance, special interest
groups may have particular and potentially troublesome concerns about the
MCT program. It could be crucial to identify each of these groups and to
anticipate its concerns, since the support of special interest groups may
make a program. Similarly, the absence or withdrawal of this support can
break it.

Some audiences may become more involved or concerned over time. An
MCT program can run for a considerable length of time, and may even become
a permanent program. Just as the program may be modified and revised over
time, so audiences will change in composition and particular interests.
New special interest groups may emerge. For example, local businessmen may
come to depend on test results in hiring high school graduates. Parents
whose children are preschoolers at the outset of a program will take a
greater interest when their children participate in the program.

There are certain subgroups which may require special attention in
the dissemination plan. It should perhaps be kept in mind that such an
audience need not be large to be essential to the success of the program.
For example, the town council and local labor unions can be small in size
but extremely influential. Neighborhoods with predominantly
non-English-speaking residents may need special consideration. A language or
sociocultural barrier may mean that a special effort is necessary to keep all
the people in a community fully informed and to keep communication
problems at a minimum.

In identifying all pertinent groups, it is important to recognize
that there are groups within the school system itself which are also
potential target audiences: teachers, students, and school administrators.

Since an MCT program is very likely to be labeled an "assessment"
program, it is also likely that many audiences will have a personal and
emotional interest in such a program. These groups may feel that their
student members will be stigmatized or discriminated against as a result
of their performance in the program.

It is useful to remember that the MCT program is essentially for
everyone's benefit. But, since it is a testing program, it will identify
students as deficient. And, like any program, it requires tax money for its
support. These two facts alone may generate negative feelings which a
cursory or half-hearted dissemination effort will do little to allay.



Identifying Audience Concerns and Goals of Dissemination 

The following list presents a sampling of categories of target
audiences, with a brief characterization of the typical concerns and the
goals planners might set in developing a dissemination plan. Programs in
which some or all of the audiences listed below have been identified and
addressed through various media include, for example, Michigan,
California, Florida, and North Carolina. Program materials from Florida and
North Carolina include pamphlets to students dealing with some of the
concerns listed below. Michigan educators focused on a number of
audiences, including district administrators, in developing dissemination
materials, while California, in its Technical Assistance Guide, describes
how an administrator might address the concerns of the community in
presenting assessment results. The discussion below, then, is based upon
interviews with planners as well as an analysis of program materials. For
other discussions of the same topic, the reader is referred to Hubbell and
Stech (n.d.) and NSPRA (1976).



(1) IN-SCHOOL AUDIENCES. The four major categories are: students,
teachers, administrators, and boards of education.

(a) Students

Concerns: Consequences of poor test performance are
usually the chief sources for concern. Questions most
frequently asked are:

— What happens to me if I fail?

— Do I get behind in other courses if I am assigned to
remediation?

— Is there a stigma attached to being in a remedial
group?

Dissemination goals: Gaining student acceptance and allaying
their fears are foremost. One approach which may be
employed is to help students understand that remediation
will make them more employable and better prepared to face
life after graduation. Planners may also want to give
considerable thought to the means by which passing or failing
scores are reported.



(b) Teachers

Concerns: Teachers may feel that the program will add
to their workloads. Some may also feel that differences
in test scores among classes will lead to evaluations of
teacher performance. Other concerns are the effect of the
program on the students and potential curricular changes.
Questions may include:

— What additional duties will be expected of me?

— Will the administration rate me as a teacher on the
basis of my students' test scores?

— Will the program be beneficial to students?

— Will the curriculum be changed? Should the curriculum
be changed?

Dissemination goals: If the program is a local one, many
of these concerns may be addressed by encouraging teachers
to take an active part in the formation of the program. It
is common knowledge that a program has a better chance for
success if the participants have planned and developed the
program themselves. Thus, beyond mere acceptance of the
program, teachers may be more supportive of the program if
they are active participants.



(c) Administrators

Concerns: These will vary from administrator to
administrator. Primary concerns may include a loss of operating
funds due to the fact that program needs have received
priority, extra work involved in organizing staff for
program implementation, and impact on the daily school routine
of program components which must be scheduled. The
administrator may also be concerned about comparisons of schools
based on test performance, about the administrator's role
in directing program components, and about increased
community concern translated into more frequent requests for
information directed to the administrator. Questions
include:



— Will I lose funds for my school because of money put
into the program?

— Will I have extra work to do in terms of planning for
testing (or for curricular or test development)?

— Will my staff have extra duties to perform?

Dissemination goals: Gaining acceptance and obtaining
cooperation and support are considered to be key goals in
terms of having every school in a district participate
equally. District-level and building-level administrators
may require thorough briefing on their roles and the roles
of their instructional staff in MCT program development and
implementation. As with teachers, active participation may
foster cooperation and support. The California State
Department of Education, in its handbook for local school
participation in MCT programs, delineates roles for
administrators within the MCT programs such as program
monitoring, involvement in standard setting, and establishing
remediation courses and alternatives.



(d) Boards of education

Concerns: The greatest concern is community impact.
Questions include:

— Will the community provide positive support?

— Will the program better prepare students for life after
graduation?

— What special interest groups may respond negatively?

Dissemination goals: The board of education may need to
be involved from the inception of a local district program.
The board is very likely to expect information about the
dissemination efforts planned for the other target
audiences.



(2) COMMUNITY AUDIENCES. These audiences are parents, residents
without school-age children, business groups, and special
interest groups.



(a) Parents

Concerns: Their main concern is generally for the effect
the program will have upon their children. This concern
is often manifested as fear or anxiety that the MCT program
will single out for failure the students with learning
problems and other disabilities. The way in which the
issue of parental concern is handled can play a significant
role in determining the success of a program. Questions
include:

— What are the criteria for passing or failing?

— Will my child get special attention if he/she doesn't
pass the test?

— Will a child who fails be singled out and stigmatized?

— Will the program focus on weaknesses in school academic
programs?

Dissemination goals: Program directors agree that parental
involvement in and support of the program is essential for
its success. Parents need to know why the program has been
initiated, what it will test, and why. One device commonly
used by many local districts is a parent council to review
program content so that parental understanding of the
content of a program is maximized. The Michigan Educational
Assessment Program adopted another approach and produced
several question-and-answer newsletters and brochures for
the general community. One, entitled A Pamphlet for
Parents, is directed solely at parents and describes the
program, lists sample objectives, and provides information
about the standard of performance. Careful attention to
providing information on how students with failing test
scores are treated and remediated is important in every
current program.



(b) Residents without children in school

Concerns: These people may be members of several different
target audiences. They can be childless couples, parents
with preschoolers, parents with grown children, and elderly
or fixed-income people. Their concerns may range from the
amount of tax money necessary to the impact of the MCT
program on the community. Questions include:

— Will taxes go up?

— What good is more testing?

— How will students be better educated because of the
program?

— Will the program reduce the number of graduates who are
functionally illiterate or unable to perform simple
arithmetic calculations?

Dissemination goals: The general purpose is to gain
community acceptance and support. Fears about increased taxes
may need to be allayed. One possible approach is to clearly
describe the benefits of the program to the community. The
American Friends Service has put out a guide for the general
community, entitled A Citizen's Introduction to Minimum
Competency Programs for Students, which describes succinctly
and clearly what citizens look for in developing and
evaluating MCT programs.

(c) Employers and business organizations

Concerns: A chief concern is whether the program will
prepare graduates better for entrance-level occupations which
require only a high school diploma. The most frequent
question is:

— Will the students passing the test make better employees?

Dissemination goals: Acceptance and support of the program
may be facilitated by showing a connection between school
preparation and success on the job.



(d) Special interest groups

Concerns: These groups may include trade unions,
sociocultural neighborhoods, ethnic-identity groups, or
equal-rights groups. A major concern of such groups is the
possibility that the MCT program may discriminate against
members of the group. If a particular group has a
disproportionately large share of students who have failed
the MCT tests, then to achieve success, the MCT program
will need to engage the support of the parents of the
deficient students in order to help such students to
participate and succeed in the appropriate remedial programs.
Failure to meet this issue might lead to charges of
discrimination, because the special interest group may feel
that the MCT program is designed only to label its student
members as deficient; the group may need special attention
to see that its members understand the function of the
remedial component of MCT as well. Questions include:

— Will failure on the test reinforce a student's negative
feelings?

— Will the program stigmatize minority groups?

— Why is the program good if students are rated by test
scores?

Dissemination goals: Gaining acceptance is a first-level
goal. Cooperation and support would be greater if any of
the members of the special interest groups have children
who will be involved in the MCT program. One apparently
effective approach to this problem is illustrated by the
community outreach program implemented in the MCT program
of Charlotte-Mecklenburg, North Carolina. As part of its
remediation effort, the district has organized after-school
community tutoring centers in disadvantaged neighborhoods
and has initiated a door-to-door outreach effort to inform
parents of students who fail the MCT test about the
remediation program and its value to their children.



(3) OTHER AUDIENCES. Important audiences that may go beyond the
boundaries of a community are the news media and education
associations.




(a) News media

Concerns: The news media may be the special group which
requires the most careful dissemination effort of all. The
members of this group may see the program as an additional
source for news, and test scores as interesting reading;
unless there is good rapport between the district (or the
state) and the media, anything which is controversial may
be emphasized at the expense of the goals and successes
of the program. Questions from the media may include:

— What is a good test score?

— What is the rationale behind the program?

— Are the program goals realistic?

— What is the response of various interested groups?

— What are the consequences to students?

— Is the program well conceived and implemented?

Dissemination goals: Many program contacts agreed that
good press support can be essential to make sure that
incorrect or distorted information is kept to a minimum,
and to gain public support of the program. For a
description of ways administrators might present information to
media representatives as well as a sample news release, see
California (1977). Many programs plan several news
conferences and even hold public question-and-answer sessions on
television (as, for example, the Charlotte-Mecklenburg,
North Carolina district). The Florida State Department of
Education invited 37 representatives of the news media to
take one of its minimum competency tests and then to write
stories about their impressions of the test (Fisher, 1978).



(b) Education associations

Concerns: Education associations, including unions, may
view the MCT program as a threat to the teacher, both in terms
of the extra, generally uncompensated work the program may
require, and in terms of the potential for teacher
evaluation which may be based unfairly on student test results.
Such groups may ask whether the program really has been
designed to aid students or to provide a cosmetic and
superficial means of satisfying the community concern for
accountability in education. Questions include:

— Will the program really help students?

— Will the teacher face an unfair burden of extra work?

— Will any part of the program evaluate a teacher's
performance?

Dissemination goals: If the MCT program is statewide,
state-level education organizations can be key target
audiences, and the goal of dissemination might be to gain
program acceptance. A local program may have to win over the
local teacher associations and unions. If teacher support
has been fostered at the grass-roots level, then
organizational support or acceptance may be easier to obtain.



Identifying Resources for Supporting Dissemination



Dissemination is a large task which may require a substantial
commitment of time and money. Two critical tasks for a dissemination plan are
to determine the message and to get it across to the right audience. The
chief resources of a dissemination effort are its personnel and the means
available for reaching the various audiences.



Personnel. In planning for dissemination, involving key people from
the earliest planning stages of the MCT program (Hubbell & Stech; NSPRA,
1976) can be important. These persons are most likely to be extremely
familiar with the MCT program and in close touch with the political
leadership of the community. It might be useful if the release of information
is monitored by the state or local district administrative leadership.
Public relations always play a large part in the operation of any
educational agency. The personnel in charge of dissemination can enhance the
chances for the success of a program if they are experienced and of
sufficient stature to command the respect of any groups or constituents they may
have to address.



Often dissemination is a team effort. At the state level this team
may consist of State Education Department public relations staff and MCT
program staff. At the local district level, such a team might include the
superintendent or a designate, the MCT program director, and guidance
staff, all persons experienced in community interaction within the school
district.

Materials and funding. Although it is impossible to develop a
formula for the funding of a dissemination effort, two useful activities
are allowing for the allocation of funds and setting up a budget for this
purpose (NSPRA, 1976). In this connection, a major factor to consider may
be the nature and the amount of effort necessary to ensure adequate
acceptance and/or support of the program from the target audiences.

To offset the large costs of dissemination and to handle its
logistics problems, the multiplier effect and donated resources are frequently
used both by state Departments of Education and by local districts. The
multiplier effect refers to the dissemination of information to a group
whose members in their turn make a similar or prescribed dissemination to
other groups. At the state level this may entail training "trainers" at
regional levels who will then visit the local districts and individual
target audiences as part of a statewide dissemination effort. At the
local level, a presentation (with handouts or packets of background
information) can be an effective way to reach the executive committee of an
organization or the leaders of a targeted group. If the presentation is
successful, these individuals can then make their own presentations or
endorsements to their constituencies. In this way, it will be possible to
contact a large number of people with little cost and effort. Such
presentations by the leaders of a group or organization will further enhance
the positive effects of dissemination. Donated resources may be in the
form of free exposure by the media: newspapers, radio, television,
community newsletters. The audience reached can be enormous; donated
resources, therefore, are an important consideration for every
dissemination plan.
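
The arithmetic behind the multiplier effect can be illustrated with a short
Python sketch. The function name, the figures, and the assumption that every
briefed person passes the message on to the same number of others are all
hypothetical, chosen only for illustration; actual reach will depend on
local conditions.

```python
def multiplier_reach(people_briefed: int, contacts_per_person: int,
                     levels: int = 1) -> int:
    """Estimate total people reached under a simple multiplier model.

    Assumption (not from the source): everyone briefed at one level passes
    the message to `contacts_per_person` new people at the next level,
    with no overlap between audiences.
    """
    total = people_briefed      # those reached directly by program staff
    current = people_briefed
    for _ in range(levels):
        current *= contacts_per_person  # people reached at the next level
        total += current
    return total


if __name__ == "__main__":
    # Hypothetical case: staff brief 10 group leaders, each of whom then
    # presents to about 40 members of a constituency.
    print(multiplier_reach(10, 40))
```

Under these assumed figures, ten staff presentations reach several hundred
people, which is the sense in which a large number of people can be
contacted with little cost and effort.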



Identifying Appropriate Media for Conveying Information 

As evidenced in operating MCT programs, the medium and the format for
carrying the information to an audience are important aspects of the
dissemination plan. A key parameter in selecting or using a medium is the
amount and nature of the resources available (NSPRA, 1976); therefore,
which media are selected by any given program will depend upon the
particular circumstances within that program.

Available means for dissemination. In selecting the means for
dissemination, planners may wish to consider the following two questions:

(1) which means will reach each target audience most satisfactorily, and

(2) what resources are available to support the means.

A brief discussion of the means of conveying information appears
below. All are familiar; what may be unfamiliar to those inexperienced in
a large-scale dissemination effort is the careful planning necessary to
select the most appropriate means for each audience, so that the intended
information is transmitted and the intended effect of that information is
achieved. An inappropriate choice may be a waste of time, effort, and
money, and may produce an adverse impact as well.



(1) CONFERENCES/WORKSHOPS. Generally these work best when the
participation of the audience is desired for program
development, implementation, or evaluation. Therefore, many programs
use advisory or steering committees composed of local community
members. Teachers come to workshops to learn how to develop the
competency statements, prepare tests, or interpret and use test
results. In South Burlington, Vermont, for example, a group of
teachers attended a workshop during the summer of 1977 at which
they developed assessments for the state-mandated competencies.
Parents and other community groups may also be invited to help
in reviewing the competencies or in setting the passing
standards for the tests. However, to use the workshop or conference
most effectively, it is often necessary to use other avenues for
diffusing information in order to make audiences aware of the
MCT program and to persuade them to participate.



(2) PUBLIC MEETINGS. The public forum can be very useful, providing
the disseminator knows and understands fully the intent of the
meeting and the composition of the attending audience before the
presentation (NSPRA, 1976). For example, many public meetings
relating to school or community affairs are attended by people
who wish to participate in the decision-making process. This
participation may take the form of the community or group
pressure which their presence can effect. In Oregon, for example,
the superintendent called for public meetings to be held
statewide; the purpose of the meetings was to gauge public sentiment
concerning what skills public schools should be responsible for
teaching. A public meeting which has been called for the
purpose of presenting information about the MCT program usually
guarantees an interested audience. In Massachusetts, public
meetings are sponsored by the Department of Education in order
to present and answer questions concerning the results of basic
skills tests. In such meetings, the audience may consist
largely of people who are very supportive of the school system
and, it is to be hoped, of the program; at the other extreme, it
may consist of people with negative attitudes toward the school
system in general or toward the program itself. Thus, the
disseminator should be prepared to cover the full gamut of
possible reactions and queries in connection with the program. A
second possibility for disseminating information at a public
meeting is "piggybacking," or adding a dissemination effort to a
meeting set for a different purpose: a school board meeting to
discuss a controversial budget, or a town meeting to select and
discuss political candidates. Many MCT program directors warn
of the danger of inadvertently associating the program with
other, perhaps emotionally laden issues and controversies on the
regular agenda of a meeting. A meeting of people who have come
to vote down a district budget, for example, may not be the
appropriate occasion for disseminating information about a
program which can itself arouse strong feelings.

(3) NEWSLETTERS/FLYERS/BROCHURES. Newsletters can be a very
inexpensive means of informing the community of the MCT program and
of keeping people informed of program progress. This is
particularly true if a newsletter is the regular periodical of a
school district, since the costs of adding the MCT program
description will be relatively small. Nearly every MCT program
uses this method. Flyers and brochures in a question-and-answer
format have been found to be particularly useful. The Michigan
Department of Education has produced several such flyers and
brochures, as did the North Carolina Department of Education.



(4) NEWSPAPER ARTICLES. This medium generally reaches the largest
number of people. Unfortunately, it can be the hardest to
control in terms of accuracy and emphasis. Hubbell and Stech and
the California Department of Education (1977) provide guidelines
and approaches for achieving good media releases and interviews.
Program planners may also want to consider Florida's use of
newspaper reporting in its program (Fisher, 1978). The
Department permitted a small number of reporters to take the
eleventh-grade Functional Literacy Test in order to foster greater
understanding of the content and difficulty of the test among
the public. In contacting the press, however, program personnel
have found that material for publication may need to be reviewed
carefully by the dissemination staff. Misinformation, once in
print, may produce an effect which is difficult to overcome.

(5) TELEVISION BROADCASTS. Because of the high visual impact and
wide exposure which this medium provides, it is important to
present spokespersons who will appeal to particular audiences.
For example, a good review of the MCT program by the
spokesperson of a special interest group may be most beneficial in
gaining that group's acceptance and support.

(6) RADIO BROADCASTS. There are generally two modes of radio
presentation. One method is to have a newscaster present a capsule
summary, perhaps periodically, of the MCT program. A
consecutive set of presentations every day for a week may reach a
diverse set of audiences. A second mode of dissemination might
be in the form of a discussion or question-and-answer forum in
which key school staff (or state-level staff) meet with a
commentator or with the spokespersons of key target audiences to
answer their questions and respond to their concerns. For
example, Florida Department of Education staff members were
interviewed by representatives of national radio networks to
answer questions concerning the assessment program.

(7) WRITTEN SUMMARY REPORTS. Written reports are usually directed
to a specialized audience, since interest will have to be at a
fairly high level to ensure that the report will be read.
However, reports take on added value as background information
packets for use in multiplier-effect situations and with donated
resources. A survey of summary reports for existing MCT
programs shows that those programs with a record of success invariably have
produced well-written reports in language which the general
reader can understand. Such reports often emphasize program
components and their rationales, topics which are known to be of
interest and concern to a variety of public audiences. Michigan
is one of a number of programs, for example, which prepare
summary reports for various groups, including classroom
teachers, parents, and school administrators.



(8) TECHNICAL REPORTS. This type of report is generally
useful for state- or local-level planning, for local assessment
of a program and the need for modification, and for assessing
the value of each component of program planning, implementation,
and evaluation. Reports prepared for current programs, such as
the Michigan Educational Assessment Program, focus on important
data presented in tables and charts so that a reader receives a
comprehensive view of the program. Since such reports are
generally available to the public, accuracy of content should be
tightly monitored. These reports may also serve as resource
material for media stories.

(9) MOVIES/SLIDE-TAPE SHOWS. Program planners may find these media
to be more appropriate for disseminating information in a lively,
topical fashion. In order to acquaint parents with the purpose
of the testing program and the part they could play in
strengthening their children's skills, the Michigan Department of
Education prepared a filmstrip for district use. Slide/tape shows
represent another avenue for conveying information, one that can
be prepared in advance and used with many different groups.

It is frequently helpful to establish criteria for selecting the
means for the diffusion of information. Some important issues which stand
out in a review of existing programs are cost, available lead time,
accessibility, and breadth of coverage and impact. Cost is always a prime
consideration. Conferences, workshops, flyers, and brochures can be expensive
due to high production and preparation costs and the size of such projects.
Lead time may often mean that planning will take place weeks or even months
before the information is expected to reach the intended audiences. For
example, to mount their dissemination efforts, the Michigan Educational
Assessment Program and the Massachusetts Basic Skills Assessment planned
strategy and documents months before the programs were operative. Since a
program may change or testing may occur before the time set for initiating
the dissemination effort, it may be necessary to make readjustments in the
dissemination plan.

The accessibility of the possible means for communication is another
important consideration, since scheduling depends on their availability,
costs, and the work necessary for preparation or development (as in the
case of documents). The breadth of coverage and the expected impact of a
particular mode of dissemination are two factors to consider as well.



The form of the message for the vehicle of dissemination. Once the
means has been selected, the content and format of information to be
disseminated can be planned. If the method chosen is a workshop, then
materials will be a consideration. Group presentations may require
overhead transparencies, filmstrips, or handouts. The facts and summaries for
the media may need to be carefully reviewed for accuracy, completeness,
and impact not only as a whole, but in the light of the effect they will
have as partial presentations. The choice of language for materials may
often present problems. On the whole, it has been found that the
avoidance of technical terms and concepts and of educational jargon is best.

The form of the message is also dependent on the circumstances in
which the presentation of a message is to occur. It should certainly be
useful to keep a record that will indicate what information is to be
disseminated when, to whom, and how (Hubbell & Stech).

In preparing a plan for dissemination, it may also help to draw up
charts with detailed descriptions in each box of the chart of the type of
information to be disseminated, the vehicle, and the target audience(s).
The use of planning charts will permit planners to map out the entire
dissemination effort in outline form so that coordination, sequencing, and
time commitments can be easily compared and grasped at a glance.
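
A planning chart of this kind can also be kept as a simple structured
record. The Python sketch below is one possible layout, offered only as an
illustration: the class name, field names, dates, and sample entries are
all hypothetical and are not drawn from any program's actual plan. Each
entry records what is to be disseminated, how, to whom, and when, and the
helper regroups the entries by audience in date order.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PlanEntry:
    """One box of a hypothetical dissemination planning chart."""
    information: str   # what is to be disseminated
    vehicle: str       # how (the means of dissemination)
    audiences: tuple   # to whom
    when: date         # when


def chart_by_audience(entries):
    """Group entries by target audience, each group in date order."""
    chart = defaultdict(list)
    for entry in sorted(entries, key=lambda e: e.when):
        for audience in entry.audiences:
            chart[audience].append(entry)
    return dict(chart)


# Illustrative entries only; not taken from any program's actual plan.
PLAN = [
    PlanEntry("Individual test results", "Written summary report",
              ("parents", "students"), date(1980, 5, 1)),
    PlanEntry("Program goals and rationale", "Question-and-answer brochure",
              ("parents", "residents"), date(1979, 9, 15)),
    PlanEntry("Test dates and procedures", "Workshop",
              ("teachers",), date(1979, 10, 1)),
]

if __name__ == "__main__":
    for audience, items in chart_by_audience(PLAN).items():
        print(audience)
        for item in items:
            print(f"  {item.when.isoformat()}  {item.vehicle}: {item.information}")
```

Grouping by audience is only one view; the same entries could equally be
sorted by vehicle or by date to check sequencing and time commitments.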



Documenting the Plan

It can be time-consuming to prepare a comprehensive plan for
dissemination with stated procedures and rationales for the suggestions and
selections made. As with the planning of the form and content of
dissemination presentations for particular audiences, the use of charts in other
aspects of the planning process will permit the development of planning
components in a clear and orderly manner. Charts also provide a
systematic means of organizing a great deal of information in a format which is
easy to understand and to explain to others. Two charts from state-level
programs are presented here. The first is a timeline for the
dissemination plan produced for one year of the Michigan Educational Assessment
Program. The reader may notice that the dissemination tasks continue
throughout the year, including before, during, and after the test dates.

The second chart is from the California State Department of 
Education. It is a suggested means of producing an overall plan for managing 
the assessment information for local districts. The chart shows major 
audiences for dissemination and major sources of information. 




Topic: Standards of proficiency 

  Information to be reported (to) 
    Student 
      1. Content of law 
      2. Skill areas to be covered 
      3. Proficiency levels 
    Parent 
      1. Content of law 
      2. Skill areas to be covered 
      3. Proficiency levels 
    Community and school board 
      1. Content of law 
      2. Skill areas to be covered 
      3. Proficiency levels 
    Teachers 
      1. Monitoring of each student's progress in reaching required 
         levels of proficiency 

  Recordkeeping (by) 
    Principal's designee; e.g., counselor 
      1. Standards to be covered in each department and course 
      2. Students who have and have not attained proficiency levels 
      3. Students on special projects 
    Administrator 
      1. Criteria for proficiency assessment 
      2. Proficiencies to be included 

Topic: Assessments 

  Information to be reported (to) 
    Student 
      4. Frequency 
      5. Date and time 
      6. Individual results 
    Parent 
      4. Frequency 
      5. Date and time 
      6. Individual results 
    Community and school board 
      4. Frequency 
      5. Date and time 
      6. Group results 
    Teachers 
      2. Methods of evaluation 
      3. Specific date and time 
      4. Results 

  Recordkeeping (by) 
    Principal's designee; e.g., counselor 
      4. Methods of assessment 
      5. Individual student results 
    Administrator 
      3. Monitoring of all assessments 
      4. Statistical data for schools and districts 

Topic: Conferences 

  Information to be reported (to) 
    Student 
      7. Notification of conference 
      8. Status 
      9. Identification of alternatives 
    Parent 
      7. Letter announcing conference 
      8. Follow-up telephone call 
    Community and school board 
      7. Formative and summative data on conferences 
      8. Student and parent reaction 
    Teachers 
      5. Participation in conferences 

  Recordkeeping (by) 
    Principal's designee; e.g., counselor 
      6. File copy of conference notifications 
      7. Follow-up phone calls 
      8. Date and time of conferences 
      9. File copies of decisions made at conferences, including 
         alternative courses selected 
      10. Special projects 
    Administrator 
      5. Verification of compliance 
      6. Provisions for alternatives 

Topic: Instructional processes; alternatives 

  Information to be reported (to) 
    Student 
      10. Courses available 
      11. Alternatives available 
    Parent 
      9. Courses available 
      10. Alternatives available 
    Community and school board 
      9. Courses available 
      10. Alternatives available 
    Teachers 
      6. Standards to be covered in their course and department 
      7. Course alternatives 

  Recordkeeping (by) 
    Principal's designee; e.g., counselor 
      11. Standards to be covered in each course and department 
      12. Students on special projects 
      13. Alternatives to regular program 
    Administrator 
      7. Courses of study and alternatives 

* From Technical Assistance Guide for Proficiency Assessment, California State Department of Education, 1977. 






Figure: Timeline for Dissemination* 

Months charted: April, May, June, July, Aug., Sept., Oct., Nov., Dec., Jan., Feb., Mar., April 

Tasks 

1. Compose & produce reports 
   a. Objectives & Procedures 
   b. Individual Student 
   c. Classroom, School & District 
   d. Technical 
   e. Statewide Results 
2. Compose & print teacher cards 
3. Produce filmstrip for all tchrs. 
4. Produce interpretation filmstrip 
5. Produce test-day cards 
6. Produce workshop folios 
7. Conduct workshops 
8. Conduct briefings 
9. Produce curriculum analyses 
10. Newsletter releases 
11. Report to State Board & Legislature 
12. Testing dates 
13. Return of data to districts 

* From Releasing Test Scores: Educational Assessment Program, How to Tell the Public, 
National School Public Relations Association, 1976. 






Summary 



In the case of a potentially controversial program, such as an MCT 
program, the dissemination effort may require more components and 
considerably more planning than that necessary for the report of an occurrence 
such as a sporting event in the local papers. The specter of 
accountability may be of concern to every identifiable audience: teachers, students, 
and administrators as well as parents, the news media, and special 
interest groups. 

Dissemination, then, becomes a delicate and demanding set of 
activities ranging over the duration of the program. Consequently, it is 
important to recognize the need for comprehensive and careful planning in the 
early stages of an MCT program, so that dissemination activities can be 
fully integrated with the other elements of the program. 






References 



American Friends Service Committee. A citizen's introduction to minimum 
competency programs for students. Columbia, South Carolina: 
Southeastern Public Education Program, 1978. 

California, State Department of Education. Technical assistance guide for 
proficiency assessment. Sacramento, California: Author, 1977. 

Fisher, T. H. Florida's approach to competency testing. Phi Delta Kappan, 
1978, 59(9), 599-602. 

Hubbell, N. S. (Producer). Parent help at home pays off in school. 
Michigan Educational Assessment Program Filmstrip, is/S. 

Hubbell, N. S., & Stech, E. L. Telling the testing story . . . through 
the mass media. Denver, Colorado: Colorado Department of Education, 
Cooperative Accountability Project, n.d. 

Michigan, State Department of Education. Do YOU use MEAP test results 
appropriately? Lansing, Michigan: Author, n.d. 

Michigan, State Department of Education. An educational health check. 
Lansing, Michigan: Author, n.d. 

Michigan, State Department of Education. A pamphlet for parents. 
Lansing, Michigan: Author, n.d. 

Michigan, State Department of Education. Questions and answers about the 
Michigan Educational Assessment Program. Lansing, Michigan: Author, 
n.d. 

National School Public Relations Association (NSPRA). Releasing test 
scores: Educational assessment program, how to tell the public. 
Arlington, Virginia: Author, 1976. 






